Learning emotion-discriminative and domain-invariant features for domain adaptation in speech emotion recognition

Cited by: 35

Authors:

Mao, Qirong [1]

Xu, Guopeng [1]

Xue, Wentao [1]

Gou, Jianping [1]

Zhan, Yongzhao [1]

Affiliations:

[1] Jiangsu Univ, Dept Comp Sci & Commun Engn, Zhenjiang, Jiangsu, Peoples R China

Source:

SPEECH COMMUNICATION | 2017 / Vol. 93

Funding:

National Natural Science Foundation of China; China Postdoctoral Science Foundation;

Keywords:

Domain adaptation; Speech emotion recognition; Neural network;

DOI:

10.1016/j.specom.2017.06.006

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Conventional approaches to Speech Emotion Recognition (SER) usually assume that the feature distributions of the training and test sets are identical. However, this assumption does not hold in many real scenarios. Although many Domain Adaptation (DA) methods have been proposed to solve this problem, they ignore emotion-discriminative information. In this paper, we propose a DA-based method for SER called the Emotion-discriminative and Domain-invariant Feature Learning Method (EDFLM), which considers both domain divergence and emotion discrimination and learns emotion-discriminative, domain-invariant features by using an emotion label constraint and a domain label constraint. Furthermore, to disentangle the emotion-related factors from the emotion-unrelated factors, we introduce an orthogonal term that encourages the input to be disentangled into two blocks: emotion-related and emotion-unrelated features. Our method learns emotion-discriminative and domain-invariant features through a back-propagation network that takes the acoustic features of the INTERSPEECH 2009 Emotion Challenge as input rather than raw speech signals. Experiments on the INTERSPEECH 2009 Emotion Challenge two-class task show that our method outperforms other state-of-the-art methods. (C) 2017 Published by Elsevier B.V.
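The abstract names three ingredients: an emotion label constraint, a domain label constraint, and an orthogonal term between emotion-related and emotion-unrelated feature blocks. The record does not give the paper's formulas, so the following is only a minimal sketch of how such terms might be combined, not the authors' implementation. It assumes cross-entropy for both label constraints, a squared Frobenius norm of the cross-product of the two feature blocks as the orthogonal term, and an adversarial sign on the domain term. All function names and the weights `lam_domain` and `lam_ortho` are hypothetical.

```python
import math


def cross_entropy(probs, labels):
    """Mean negative log-likelihood of the true labels.

    probs: list of per-sample class-probability lists.
    labels: list of integer class indices.
    """
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


def orthogonality_penalty(related, unrelated):
    """Squared Frobenius norm of related^T @ unrelated over a batch.

    related: N feature vectors of length d1 (emotion-related block).
    unrelated: N feature vectors of length d2 (emotion-unrelated block).
    The value is zero when every related dimension is orthogonal,
    across the batch, to every unrelated dimension.
    """
    d1, d2 = len(related[0]), len(unrelated[0])
    total = 0.0
    for i in range(d1):
        for j in range(d2):
            dot = sum(r[i] * u[j] for r, u in zip(related, unrelated))
            total += dot * dot
    return total


def feature_extractor_objective(emotion_probs, emotion_labels,
                                domain_probs, domain_labels,
                                related, unrelated,
                                lam_domain=1.0, lam_ortho=0.1):
    """Hypothetical combined objective for the feature extractor.

    Assumption: the domain term enters with a negative sign, so that
    minimizing the objective makes domains harder to tell apart
    (an adversarial setup). The paper may combine the terms differently.
    """
    emotion_loss = cross_entropy(emotion_probs, emotion_labels)
    domain_loss = cross_entropy(domain_probs, domain_labels)
    ortho = orthogonality_penalty(related, unrelated)
    return emotion_loss - lam_domain * domain_loss + lam_ortho * ortho
```

In this sketch, a batch whose two feature blocks are already orthogonal contributes nothing through the orthogonal term, so the objective reduces to the trade-off between emotion discrimination and domain confusion.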

Pages: 1-10

Page count: 10

50 items in total

[1] Speech Emotion Recognition Based on Transfer Emotion-Discriminative Features Subspace Learning
Zhang, Kexin
Liu, Yunxiang
[J]. IEEE ACCESS, 2023, 11 : 56336 - 56343
[2] DOMAIN-INVARIANT FEATURE LEARNING FOR CROSS CORPUS SPEECH EMOTION RECOGNITION
Gao, Yuan
Okada, Shogo
Wang, Longbiao
Liu, Jiaxing
Dang, Jianwu
[J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 6427 - 6431
[3] Learning Domain-Invariant and Discriminative Features for Homogeneous Unsupervised Domain Adaptation
ZHANG Yun
WANG Nianbin
CAI Shaobin
[J]. Chinese Journal of Electronics, 2020, 29 (06) : 1119 - 1125
[4] Learning Domain-Invariant Discriminative Features for Heterogeneous Face Recognition
Yang, Shanmin
Fu, Keren
Yang, Xiao
Lin, Ye
Zhang, Jianwei
Peng, Cheng
[J]. IEEE ACCESS, 2020, 8 : 209790 - 209801
[5] Learning Class-Aligned and Generalized Domain-Invariant Representations for Speech Emotion Recognition
Xiao, Yufeng
Zhao, Huan
Li, Tingting
[J]. IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2020, 4 (04) : 480 - 489
[6] Domain-Invariant Feature Learning for Domain Adaptation
Tu, Ching-Ting
Lin, Hsiau-Wen
Lin, Hwei Jen
Tokuyama, Yoshimasa
Chu, Chia-Hung
[J]. INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2023, 37 (03)
[7] Speaker to Emotion: Domain Adaptation for Speech Emotion Recognition with Residual Adapters
Xi, Yuxuan
Li, Pengcheng
Song, Yan
Jiang, Yiheng
Dai, Lirong
[J]. 2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 513 - 518
[8] SUPERVISED DOMAIN ADAPTATION FOR EMOTION RECOGNITION FROM SPEECH
Abdelwahab, Mohammed
Busso, Carlos
[J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 5058 - 5062
[9] Adversarial Domain Adaptation for Noisy Speech Emotion Recognition
Cho, Sunyoung
Yoon, Soosung
Song, Hyunseung
[J]. 2022 22ND INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND SYSTEMS (ICCAS 2022), 2022, : 1966 - 1970
[10] Domain Invariant Feature Learning for Speaker-Independent Speech Emotion Recognition
Lu, Cheng
Zong, Yuan
Zheng, Wenming
Li, Yang
Tang, Chuangao
Schuller, Bjoern W.
[J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 2217 - 2230