Enhancing multilingual recognition of emotion in speech by language identification

Cited by: 20
Authors
Sagha, Hesam [1]
Matejka, Pavel [2, 3, 4]
Gavryukova, Maryna [1]
Povolny, Filip [2]
Marchi, Erik [1]
Schuller, Bjoern [1, 5]
Affiliations
[1] Univ Passau, Chair Complex & Intelligent Syst, Passau, Germany
[2] Phonexia Brno, Brno, Czech Republic
[3] Brno Univ Technol, Speech FIT, Brno, Czech Republic
[4] Brno Univ Technol, Ctr Excellence IT4I, Brno, Czech Republic
[5] Imperial Coll London, Dept Comp, London, England
Funding
European Union Horizon 2020;
Keywords
multilingual emotion recognition; language identification; language families; FEATURES;
DOI
10.21437/Interspeech.2016-333
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
We investigate, for the first time, whether applying model selection based on automatic language identification (LID) can improve multilingual recognition of emotion in speech. Six emotional speech corpora from three language families (Germanic, Romance, Sino-Tibetan) are evaluated. The emotions are represented by the quadrants in the arousal/valence plane, i.e., positive/negative arousal/valence. Four selection approaches for choosing an optimal training set depending on the current language are compared: within the same language family, across language families, use of all available corpora, and selection based on automatic LID. We found that, on average, the proposed LID approach for selecting training corpora is superior to using all available corpora when the spoken language is not known.
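The four training-set selection approaches and the quadrant labelling described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the corpus names and language codes are hypothetical placeholders (the abstract does not name the six corpora), the choice of two corpora per family is an assumption, the held-out test corpus is assumed to be excluded from training, and the LID step is assumed to output a language family.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Corpus:
    name: str      # hypothetical placeholder name
    language: str  # hypothetical placeholder language code
    family: str    # one of the three families named in the abstract


# Hypothetical registry: six placeholder corpora, assumed two per family.
CORPORA = [
    Corpus("corpus_a", "lang_g1", "Germanic"),
    Corpus("corpus_b", "lang_g2", "Germanic"),
    Corpus("corpus_c", "lang_r1", "Romance"),
    Corpus("corpus_d", "lang_r2", "Romance"),
    Corpus("corpus_e", "lang_s1", "Sino-Tibetan"),
    Corpus("corpus_f", "lang_s2", "Sino-Tibetan"),
]


def quadrant(arousal: float, valence: float) -> str:
    """Map an (arousal, valence) pair to one of four quadrant labels.

    Treating 0 as the boundary and as 'positive' is an assumption.
    """
    a = "positive" if arousal >= 0 else "negative"
    v = "positive" if valence >= 0 else "negative"
    return f"{a} arousal / {v} valence"


def select_training_corpora(
    strategy: str,
    test_corpus: Corpus,
    predicted_family: Optional[str] = None,
) -> List[Corpus]:
    """Return the training corpora for one held-out test corpus."""
    # Assumption: the test corpus itself is never used for training.
    others = [c for c in CORPORA if c.name != test_corpus.name]
    if strategy == "all":
        return others
    if strategy == "within_family":
        return [c for c in others if c.family == test_corpus.family]
    if strategy == "across_family":
        return [c for c in others if c.family != test_corpus.family]
    if strategy == "lid":
        # The true language is unknown here; only the LID output is used.
        if predicted_family is None:
            raise ValueError("the 'lid' strategy needs a predicted family")
        return [c for c in others if c.family == predicted_family]
    raise ValueError(f"unknown strategy: {strategy}")
```

In this sketch the "lid" strategy differs from "within_family" only in where the family label comes from: an automatic prediction rather than ground truth, so a misidentified language selects a different training set.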
Pages: 2949-2953
Number of pages: 5