| Name | Edwardo Fumihiko Fukushima (福島 E.文彦) |
| Affiliation | Faculty of Engineering, Department of Mechanical Engineering |
| Position | Professor |
| Language | English | 
| Date of publication | 2025/10 | 
| Type | Academic journal article | 
| Peer review | Refereed | 
| Title | Hetero-Logit Alignment and Ambiguity Fragment Self-Weighting for Speech Emotion Recognition in Human-Robot Interaction | 
| Authorship | Co-authored | 
| Journal | IEEE Internet of Things Journal ( Early Access ) | 
| Publication region | Overseas | 
| Publisher | IEEE | 
| International co-authorship | Yes | 
| Authors | Cheng-Shan Jiang; Zhen-Tao Liu; Edwardo F. Fukushima; Jinhua She | 
| Abstract | Speech Emotion Recognition (SER) is pivotal for achieving empathetic and adaptive Human-Robot Interaction (HRI) within Internet of Things (IoT) ecosystems. However, conventional SER methods face the computational inefficiency of large self-supervised speech models, which hinders deployment in HRI systems. Moreover, utterance-level emotion labels introduce segment-level ambiguities, degrading model training. To mitigate these limitations, we propose the Multi-View Logit-based Heterogeneous Model Alignment (MVLogit-HMA) framework, which transfers knowledge from a large self-supervised Trunk (TKG) encoder to a lightweight Package (PKG) encoder through the joint alignment of instance-level contrastive guidance and class-level prototype relations in both the logit space and the probability distribution, thereby harmonizing representations across heterogeneous model architectures. Simultaneously, we introduce Ambiguity Fragment Self-Weighting (AFSW) to dynamically down-weight unreliable segments and enforce discriminative separation between high- and low-ambiguity groups via an adaptive boundary loss. Comprehensive evaluations on the IEMOCAP and RAVDESS datasets confirm the superiority of our method, achieving weighted accuracy (WA) of 74.17% and 90.02%, and unweighted accuracy (UA) of 74.47% and 90.39%, respectively. Furthermore, a preliminary application in a real-world HRI scenario validates the practical viability of our approach. | 
| External link URL | https://doi.org/10.1109/JIOT.2025.3620303 |
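The core idea of AFSW described in the abstract, dynamically down-weighting unreliable (high-ambiguity) speech segments before forming an utterance-level prediction, can be sketched with a simple entropy-based weighting. This is an illustrative assumption, not the paper's exact formulation: the function name, the use of prediction entropy as the ambiguity measure, and the `temperature` parameter are all hypothetical stand-ins.

```python
import math

def segment_self_weights(probs, temperature=1.0):
    """Assign each segment a weight that shrinks as its prediction
    entropy (an ambiguity proxy) grows; weights sum to 1.

    probs: list of per-segment class-probability lists.
    """
    eps = 1e-12
    # Shannon entropy of each segment's predicted distribution.
    entropies = [-sum(p * math.log(p + eps) for p in seg) for seg in probs]
    # Softmax over negative entropy: confident segments dominate.
    logits = [-h / temperature for h in entropies]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy example: 3 segments of one utterance, 4 emotion classes.
probs = [
    [0.85, 0.05, 0.05, 0.05],  # confident segment
    [0.25, 0.25, 0.25, 0.25],  # maximally ambiguous segment
    [0.60, 0.20, 0.10, 0.10],  # moderately confident segment
]
w = segment_self_weights(probs)
# Weighted utterance-level distribution: ambiguous segments contribute less.
utterance = [sum(w[i] * probs[i][c] for i in range(len(probs)))
             for c in range(len(probs[0]))]
```

In this sketch the uniform (maximally ambiguous) segment receives the smallest weight, so the utterance-level distribution is dominated by the confident segments; the actual AFSW additionally enforces separation between high- and low-ambiguity groups via an adaptive boundary loss, which is not modeled here.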