Multilingual, Cross-lingual, and Monolingual Speech Emotion Recognition on EmoFilm Dataset

被引:0
|
作者
Atmaja, Bagus Tris [1 ]
Sasou, Akira [1 ]
机构
[1] Natl Inst Adv Ind Sci & Technol, Tokyo, Japan
关键词
speech emotion recognition; pre-trained model; multilingual emotion recognition; EmoFilm;
D O I
10.1109/APSIPAASC58517.2023.10317223
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Research on speech emotion recognition has been actively conducted; most are in monolingual settings. Considering that emotion expressed in speech is universal, it is noteworthy to conduct multilingual emotion recognition across different cultures. This paper contributes to evaluating multilingual, cross-lingual, and monolingual automatic speech emotion recognition (SER) on the same EmoFilm dataset. We first evaluated these three scenarios on a fixed training/test split. For multilingual emotion recognition, we then expanded the evaluation with cross-validation. The results show that the multilingual SER gained the highest performance with 74.86% of balanced accuracy for five categorical emotions, followed by 72.17% and 58.03% for monolingual and cross-lingual evaluations. We reduced the number of training samples to observe its impact and found that the monolingual setting is superior among others on the same number of samples. The results of this study could suggest the potential use of multilingual SER over cross-lingual and monolingual SER in future speech technologies.
引用
收藏
页码:1019 / 1025
页数:7
相关论文
共 50 条
  • [1] CROSS-LINGUAL AND MULTILINGUAL SPEECH EMOTION RECOGNITION ON ENGLISH AND FRENCH
    Neumann, Michael
    Ngoc Thang Vu
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5769 - 5773
  • [2] Speech Emotion Recognition with Cross-lingual Databases
    Chiou, Bo-Chang
    Chen, Chia-Ping
    [J]. 15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 558 - 561
  • [3] Monolingual, multilingual and cross-lingual code comment classification
    Kostic, Marija
    Batanovic, Vuk
    Nikolic, Bosko
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 124
  • [4] Reproducing Monolingual, Multilingual and Cross-Lingual CEFR Predictions
    Bestgen, Yves
    [J]. PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 5595 - 5602
  • [5] Cross-lingual Speech Emotion Recognition through Factor Analysis
    Desplanques, Brecht
    Demuynck, Kris
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3648 - 3652
  • [6] Semi-supervised cross-lingual speech emotion recognition
    Agarla, Mirko
    Bianco, Simone
    Celona, Luigi
    Napoletano, Paolo
    Petrovsky, Alexey
    Piccoli, Flavio
    Schettini, Raimondo
    Shanin, Ivan
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 237
  • [7] UNSUPERVISED CROSS-LINGUAL SPEECH EMOTION RECOGNITION USING PSEUDO MULTILABEL
    Li, Fin
    Yan, Nan
    Wang, Lan
    [J]. 2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 366 - 373
  • [8] Unsupervised Adversarial Domain Adaptation for Cross-Lingual Speech Emotion Recognition
    Latif, Siddique
    Qadir, Junaid
    Bilal, Muhammad
    [J]. 2019 8TH INTERNATIONAL CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII), 2019,
  • [9] METTS: Multilingual Emotional Text-to-Speech by Cross-Speaker and Cross-Lingual Emotion Transfer
    Zhu, Xinfa
    Lei, Yi
    Li, Tao
    Zhang, Yongmao
    Zhou, Hongbin
    Lu, Heng
    Xie, Lei
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 1506 - 1518
  • [10] Investigating the Impact of Cross-lingual Acoustic-Phonetic Similarities on Multilingual Speech Recognition
    Farooq, Muhammad Umar
    Hain, Thomas
    [J]. INTERSPEECH 2022, 2022, : 3849 - 3853