| Name | Edwardo Fumihiko Fukushima (福島 E.文彦) |
| Affiliation | Faculty of Engineering, Department of Mechanical Engineering |
| Position | Professor |
| Language | English | 
| Date of publication | 2025/10 | 
| Type | Academic journal article | 
| Peer review | Refereed | 
| Title | Hetero-Logit Alignment and Ambiguity Fragment Self-Weighting for Speech Emotion Recognition in Human-Robot Interaction | 
| Authorship | Co-authored | 
| Journal | IEEE Internet of Things Journal ( Early Access ) | 
| Publication region | Overseas | 
| Publisher | IEEE | 
| International co-authorship | Yes | 
| Authors | Cheng-Shan Jiang; Zhen-Tao Liu; Edwardo F. Fukushima; Jinhua She | 
| Abstract | Speech Emotion Recognition (SER) is pivotal for achieving empathetic and adaptive Human-Robot Interaction (HRI) within Internet of Things (IoT) ecosystems. However, conventional SER methods face the computational inefficiency of large self-supervised speech models, which hinders deployment in HRI systems. Moreover, utterance-level emotion labels introduce segment-level ambiguities, degrading model training. To mitigate these limitations, we propose the Multi-View Logit-based Heterogeneous Model Alignment (MVLogit-HMA) framework, which transfers knowledge from a large self-supervised Trunk (TKG) encoder to a lightweight Package (PKG) encoder through the joint alignment of instance-level contrastive guidance and class-level prototype relations in both the logit space and the probability distribution, thereby harmonizing representations across heterogeneous model architectures. Simultaneously, we introduce Ambiguity Fragment Self-Weighting (AFSW) to dynamically down-weight unreliable segments and enforce discriminative separation between high- and low-ambiguity groups via an adaptive boundary loss. Comprehensive evaluations on the IEMOCAP and RAVDESS datasets confirm the superiority of our method, achieving weighted accuracy (WA) of 74.17% and 90.02%, and unweighted accuracy (UA) of 74.47% and 90.39%, respectively. Furthermore, a preliminary application in a real-world HRI scenario validates the practical viability of our approach. | 
| External link URL | https://doi.org/10.1109/JIOT.2025.3620303 |
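The core idea of AFSW described in the abstract, dynamically down-weighting unreliable (high-ambiguity) speech segments before forming an utterance-level prediction, can be sketched with a simple entropy-based weighting. This is an illustrative assumption, not the paper's exact formulation: the function name, the use of prediction entropy as the ambiguity measure, and the `temperature` parameter are all hypothetical stand-ins.

```python
import math

def segment_self_weights(probs, temperature=1.0):
    """Assign each segment a weight that shrinks as its prediction
    entropy (an ambiguity proxy) grows; weights sum to 1.

    probs: list of per-segment class-probability lists.
    """
    eps = 1e-12
    # Shannon entropy of each segment's predicted distribution.
    entropies = [-sum(p * math.log(p + eps) for p in seg) for seg in probs]
    # Softmax over negative entropy: confident segments dominate.
    logits = [-h / temperature for h in entropies]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy example: 3 segments of one utterance, 4 emotion classes.
probs = [
    [0.85, 0.05, 0.05, 0.05],  # confident segment
    [0.25, 0.25, 0.25, 0.25],  # maximally ambiguous segment
    [0.60, 0.20, 0.10, 0.10],  # moderately confident segment
]
w = segment_self_weights(probs)
# Weighted utterance-level distribution: ambiguous segments contribute less.
utterance = [sum(w[i] * probs[i][c] for i in range(len(probs)))
             for c in range(len(probs[0]))]
```

In this sketch the uniform (maximally ambiguous) segment receives the smallest weight, so the utterance-level distribution is dominated by the confident segments; the actual AFSW additionally enforces separation between high- and low-ambiguity groups via an adaptive boundary loss, which is not modeled here.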