Unsupervised Adversarial Domain Adaptation for Cross-Lingual Speech Emotion Recognition

Cited by: 0
Authors
Latif, Siddique [1]
Qadir, Junaid [2]
Bilal, Muhammad [3]
Affiliations
[1] Univ Southern Queensland, Toowoomba, Qld, Australia
[2] ITU, Lahore, Pakistan
[3] UWE, Bristol, Avon, England
Keywords
Speech emotion recognition; Urdu language; Multi-lingual; Generative adversarial networks (GANs); Domain adaptation
DOI
10.1109/acii.2019.8925513
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Cross-lingual speech emotion recognition (SER) is a crucial task for many real-world applications. The performance of SER systems is often degraded by differences between the distributions of training and test data. These differences become more apparent when the training and test data belong to different languages, which causes a significant gap between validation and test scores. It is imperative to build more robust models that suit practical applications of SER systems. Therefore, in this paper, we propose a Generative Adversarial Network (GAN)-based model for multilingual SER. Our choice of GANs is motivated by their great success in learning the underlying data distribution. The proposed model is designed so that language-invariant representations can be learned without requiring target-language data labels. We evaluate the proposed model on four emotional datasets from different languages, including an Urdu-language dataset, in order to cover languages for which labelled data is difficult to find and which have not been studied much by the mainstream community. Our results show that the proposed model can significantly improve on the baseline cross-lingual SER performance for all the considered datasets, including the non-mainstream Urdu-language data, without requiring any labels.
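The abstract does not give the model's architecture or loss functions, so the following is only a minimal sketch of the general idea behind adversarial domain adaptation, not the authors' method. It assumes a shared encoder, an emotion classifier trained on labelled source-language data, and a discriminator that tries to tell source-language features from unlabelled target-language features. All function names, layer shapes, and the random data are hypothetical, and only the loss terms are computed (no training loop).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def encode(x, W_enc):
    """Shared encoder (hypothetical): acoustic features -> latent representation."""
    return np.tanh(x @ W_enc)

def emotion_loss(h_src, y_src, W_cls):
    """Cross-entropy on labelled source-language data only."""
    p = softmax(h_src @ W_cls)
    return -np.mean(np.log(p[np.arange(len(y_src)), y_src] + 1e-12))

def discriminator_loss(h_src, h_tgt, w_dis):
    """Binary cross-entropy: discriminator labels source as 1, target as 0."""
    p_src = sigmoid(h_src @ w_dis)
    p_tgt = sigmoid(h_tgt @ w_dis)
    return -np.mean(np.log(p_src + 1e-12)) - np.mean(np.log(1.0 - p_tgt + 1e-12))

def encoder_adversarial_loss(h_tgt, w_dis):
    """Encoder is rewarded when target features are mistaken for source features."""
    p_tgt = sigmoid(h_tgt @ w_dis)
    return -np.mean(np.log(p_tgt + 1e-12))

# Hypothetical sizes and synthetic data standing in for two languages.
rng = np.random.default_rng(0)
n_feat, n_latent, n_emotions = 20, 8, 4
x_src = rng.normal(size=(32, n_feat))            # labelled source language
y_src = rng.integers(0, n_emotions, size=32)
x_tgt = rng.normal(loc=0.5, size=(32, n_feat))   # unlabelled target language

W_enc = rng.normal(scale=0.1, size=(n_feat, n_latent))
W_cls = rng.normal(scale=0.1, size=(n_latent, n_emotions))
w_dis = rng.normal(scale=0.1, size=n_latent)

h_src, h_tgt = encode(x_src, W_enc), encode(x_tgt, W_enc)
l_cls = emotion_loss(h_src, y_src, W_cls)
l_dis = discriminator_loss(h_src, h_tgt, w_dis)
l_adv = encoder_adversarial_loss(h_tgt, w_dis)
print(f"emotion={l_cls:.3f} discriminator={l_dis:.3f} encoder_adv={l_adv:.3f}")
```

In a full training loop, the discriminator would be updated to minimise its loss while the encoder and classifier are updated to minimise the emotion loss plus the adversarial term. Target-language emotion labels never appear in any of the three losses, which is what makes the setting unsupervised with respect to the target language.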
Pages: 6
Related papers
50 in total
  • [21] Adversarial unsupervised domain adaptation for cross scenario waveform recognition
    Wang, Qing
    Du, Panfei
    Liu, Xiaofeng
    Yang, Jingyu
    Wang, Guohua
    [J]. SIGNAL PROCESSING, 2020, 171
  • [22] Unsupervised cross-lingual word embeddings learning with adversarial training
    Li, Yuling
    Zhang, Yuhong
    Li, Peipei
    Hu, Xuegang
    [J]. 2019 10TH IEEE INTERNATIONAL CONFERENCE ON BIG KNOWLEDGE (ICBK 2019), 2019, : 150 - 156
  • [23] Generalised Unsupervised Domain Adaptation of Neural Machine Translation with Cross-Lingual Data Selection
    Thuy-Trang Vu
    He, Xuanli
    Dinh Phung
    Haffari, Gholamreza
    [J]. 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 3335 - 3346
  • [24] Autoencoder-based Unsupervised Domain Adaptation for Speech Emotion Recognition
    Deng, Jun
    Zhang, Zixing
    Eyben, Florian
    Schuller, Bjoern
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2014, 21 (09) : 1068 - 1072
  • [25] COUPLED UNSUPERVISED DEEP CONVOLUTIONAL DOMAIN ADAPTATION FOR SPEECH EMOTION RECOGNITION
    Noi, Ocquaye Elias Nii
    Mao, Qirong
    Xu, Guopeng
    Xue, Yanfei
    [J]. 2018 IEEE FOURTH INTERNATIONAL CONFERENCE ON MULTIMEDIA BIG DATA (BIGMM), 2018
  • [26] Self Supervised Adversarial Domain Adaptation for Cross-Corpus and Cross-Language Speech Emotion Recognition
    Latif, Siddique
    Rana, Rajib
    Khalifa, Sara
    Jurdak, Raja
    Schuller, Bjorn
    [J]. IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2023, 14 (03) : 1912 - 1926
  • [27] IMPROVING LUXEMBOURGISH SPEECH RECOGNITION WITH CROSS-LINGUAL SPEECH REPRESENTATIONS
    Le Minh Nguyen
    Nayak, Shekhar
    Coler, Matt
    [J]. 2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 792 - 797
  • [28] Personalising speech-to-speech translation: Unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis
    Dines, John
    Liang, Hui
    Saheer, Lakshmi
    Gibson, Matthew
    Byrne, William
    Oura, Keiichiro
    Tokuda, Keiichi
    Yamagishi, Junichi
    King, Simon
    Wester, Mirjam
    Hirsimaki, Teemu
    Karhila, Reima
    Kurimo, Mikko
    [J]. COMPUTER SPEECH AND LANGUAGE, 2013, 27 (02) : 420 - 437
  • [29] PPT: Parsimonious Parser Transfer for Unsupervised Cross-Lingual Adaptation
    Kurniawan, Kemal
    Frermann, Lea
    Schulz, Philip
    Cohn, Trevor
    [J]. 16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 2907 - 2918
  • [30] Convolutional Auto-Encoder and Adversarial Domain Adaptation for Cross-Corpus Speech Emotion Recognition
    Wang, Yang
    Fu, Hongliang
    Tao, Huawei
    Yang, Jing
    Ge, Hongyi
    Xie, Yue
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2022, E105D (10) : 1803 - 1806