Unsupervised Cross-Lingual Speech Emotion Recognition Using Domain Adversarial Neural Network

被引:4
|
作者
Cai, Xiong [1 ]
Wu, Zhiyong [1 ,2 ]
Zhong, Kuo [1 ]
Su, Bin [1 ]
Dai, Dongyang [1 ]
Meng, Helen [1 ,2 ]
机构
[1] Tsinghua Univ, Shenzhen Int Grad Sch, Tsinghua CUHK Joint Res Ctr Media Sci Technol & S, Shenzhen, Peoples R China
[2] Chinese Univ Hong Kong, Dept Syst Engn & Engn Management, Shatin, Hong Kong, Peoples R China
基金
中国国家自然科学基金;
关键词
speech emotion recognition; domain adversarial learning; cross-lingual; affective representation learning;
D O I
10.1109/ISCSLP49672.2021.9362058
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
By using deep learning approaches, Speech Emotion Recognition (SER) on a single domain has achieved many excellent results. However, cross-domain SER is still a challenging task due to the distribution shift between source and target domains. In this work, we propose a Domain Adversarial Neural Network (DANN) based approach to mitigate this distribution shift problem for cross-lingual SER. Specifically, we add a language classifier and gradient reversal layer after the feature extractor to force the learned representation both language-independent and emotion-meaningful. Our method is unsupervised, i. e., labels on target language are not required, which makes it easier to apply our method to other languages. Experimental results show the proposed method provides an average absolute improvement of 3.91% over the baseline system for arousal and valence classification task. Furthermore, we find that batch normalization is beneficial to the performance gain of DANN. Therefore we also explore the effect of different ways of data combination for batch normalization.
引用
收藏
页数:5
相关论文
共 50 条
  • [1] Unsupervised Adversarial Domain Adaptation for Cross-Lingual Speech Emotion Recognition
    Latif, Siddique
    Qadir, Junaid
    Bilal, Muhammad
    [J]. 2019 8TH INTERNATIONAL CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII), 2019,
  • [2] UNSUPERVISED CROSS-LINGUAL SPEECH EMOTION RECOGNITION USING PSEUDO MULTILABEL
    Li, Fin
    Yan, Nan
    Wang, Lan
    [J]. 2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 366 - 373
  • [3] Speech Emotion Recognition with Cross-lingual Databases
    Chiou, Bo-Chang
    Chen, Chia-Ping
    [J]. 15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 558 - 561
  • [4] Unsupervised Cross-lingual Representation Learning for Speech Recognition
    Conneau, Alexis
    Baevski, Alexei
    Collobert, Ronan
    Mohamed, Abdelrahman
    Auli, Michael
    [J]. INTERSPEECH 2021, 2021, : 2426 - 2430
  • [5] Single- and Cross-Lingual Speech Emotion Recognition Based on WavLM Domain Emotion Embedding
    Yang, Jichen
    Liu, Jiahao
    Huang, Kai
    Xia, Jiaqi
    Zhu, Zhengyu
    Zhang, Han
    [J]. ELECTRONICS, 2024, 13 (07)
  • [6] Domain-Adversarial Based Model with Phonological Knowledge for Cross-Lingual Speech Recognition
    Zhan, Qingran
    Xie, Xiang
    Hu, Chenguang
    Zuluaga-Gomez, Juan
    Wang, Jing
    Cheng, Haobo
    [J]. ELECTRONICS, 2021, 10 (24)
  • [7] Cross-lingual Speech Emotion Recognition through Factor Analysis
    Desplanques, Brecht
    Demuynck, Kris
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3648 - 3652
  • [8] CROSS-LINGUAL AND MULTILINGUAL SPEECH EMOTION RECOGNITION ON ENGLISH AND FRENCH
    Neumann, Michael
    Ngoc Thang Vu
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5769 - 5773
  • [9] Semi-supervised cross-lingual speech emotion recognition
    Agarla, Mirko
    Bianco, Simone
    Celona, Luigi
    Napoletano, Paolo
    Petrovsky, Alexey
    Piccoli, Flavio
    Schettini, Raimondo
    Shanin, Ivan
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 237
  • [10] Multilingual, Cross-lingual, and Monolingual Speech Emotion Recognition on EmoFilm Dataset
    Atmaja, Bagus Tris
    Sasou, Akira
    [J]. 2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC, 2023, : 1019 - 1025