Unsupervised Adversarial Domain Adaptation for Cross-Lingual Speech Emotion Recognition

被引:0
|
作者
Latif, Siddique [1 ]
Qadir, Junaid [2 ]
Bilal, Muhammad [3 ]
机构
[1] Univ Southern Queensland, Toowoomba, Qld, Australia
[2] ITU, Lahore, Pakistan
[3] UWE, Bristol, Avon, England
关键词
Speech emotion recognition; Urdu language; Multi-lingual; generative adversarial networks (GANs); domain adaptation;
D O I
10.1109/acii.2019.8925513
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Cross-lingual speech emotion recognition (SER) is a crucial task for many real-world applications. The performance of SER systems is often degraded by the differences in the distributions of training and test data. These differences become more apparent when training and test data belong to different languages, which cause a significant performance gap between the validation and test scores. It is imperative to build more robust models that can fit in practical applications of SER systems. Therefore, in this paper, we propose a Generative Adversarial Network (GAN)-based model for multilingual SER. Our choice of using GAN is motivated by their great success in learning the underlying data distribution. The proposed model is designed in such a way that the language invariant representations can be learned without requiring target-language data labels. We evaluate our proposed model on four different language emotional datasets, including an Urdu-language dataset to also incorporate alternative languages for which labelled data is difficult to find and which have not been studied much by the mainstream community. Our results show that our proposed model can significantly improve the baseline cross-lingual SER performance for all the considered datasets including the non-mainstream Urdu language data without requiring any labels.
引用
收藏
页数:6
相关论文
共 50 条
  • [1] Unsupervised Cross-Lingual Speech Emotion Recognition Using Domain Adversarial Neural Network
    Cai, Xiong
    Wu, Zhiyong
    Zhong, Kuo
    Su, Bin
    Dai, Dongyang
    Meng, Helen
    [J]. 2021 12TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2021,
  • [2] UNSUPERVISED CROSS-LINGUAL SPEECH EMOTION RECOGNITION USING PSEUDO MULTILABEL
    Li, Fin
    Yan, Nan
    Wang, Lan
    [J]. 2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 366 - 373
  • [3] Speech Emotion Recognition with Cross-lingual Databases
    Chiou, Bo-Chang
    Chen, Chia-Ping
    [J]. 15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 558 - 561
  • [4] Unsupervised Cross-lingual Representation Learning for Speech Recognition
    Conneau, Alexis
    Baevski, Alexei
    Collobert, Ronan
    Mohamed, Abdelrahman
    Auli, Michael
    [J]. INTERSPEECH 2021, 2021, : 2426 - 2430
  • [5] Cross-Lingual Adversarial Domain Adaptation for Novice Programming
    Mao, Ye
    Khoshnevisan, Farzaneh
    Price, Thomas
    Barnes, Tiffany
    Chi, Min
    [J]. THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 7682 - 7690
  • [6] Single- and Cross-Lingual Speech Emotion Recognition Based on WavLM Domain Emotion Embedding
    Yang, Jichen
    Liu, Jiahao
    Huang, Kai
    Xia, Jiaqi
    Zhu, Zhengyu
    Zhang, Han
    [J]. ELECTRONICS, 2024, 13 (07)
  • [7] Domain-Adversarial Based Model with Phonological Knowledge for Cross-Lingual Speech Recognition
    Zhan, Qingran
    Xie, Xiang
    Hu, Chenguang
    Zuluaga-Gomez, Juan
    Wang, Jing
    Cheng, Haobo
    [J]. ELECTRONICS, 2021, 10 (24)
  • [8] Unsupervised Domain Adaptation of a Pretrained Cross-Lingual Language Model
    Li, Juntao
    He, Ruidan
    Ye, Hai
    Ng, Hwee Tou
    Bing, Lidong
    Yan, Rui
    [J]. PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 3672 - 3678
  • [9] Cross-lingual Speech Emotion Recognition through Factor Analysis
    Desplanques, Brecht
    Demuynck, Kris
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3648 - 3652
  • [10] CROSS-LINGUAL AND MULTILINGUAL SPEECH EMOTION RECOGNITION ON ENGLISH AND FRENCH
    Neumann, Michael
    Ngoc Thang Vu
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5769 - 5773