Emotional Speech Recognition: A Multilingual Perspective

Cited by: 0
Authors
Meftah, Ali [1]
Alotaibi, Yousef [1]
Selouani, Sid-Ahmed [2]
Affiliations
[1] King Saud Univ, Coll Comp & Informat Sci, Riyadh, Saudi Arabia
[2] Univ Moncton, 218 Blvd, Shippegan, NB E8S 1P6, Canada
Keywords
Speech; emotion; Arabic; English; recognition; ANN; FEATURES;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper compares and analyzes speech emotion recognition in Arabic and English. Four emotions (neutral, sadness, happiness and anger) were considered from two speech corpora: the King Saud University Emotions (KSUEmotions) corpus for Arabic and the Emotional Prosody Speech and Transcripts (EPST) corpus for English. Six speakers (three men and three women) were selected from each corpus. Many acoustic features were extracted for use in the recognition and analysis stages. Additionally, an analysis of variance (ANOVA) was used to determine which acoustic features should be used in the emotion recognition system. The results show that using specific acoustic features benefits emotion recognition for Arabic words. They also show that certain speech features, such as the first three formants, improve emotion recognition accuracy.
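The ANOVA-based feature selection mentioned in the abstract can be illustrated with a minimal sketch. It assumes a one-way ANOVA per feature, with the four emotions as groups, and ranks features by their F statistic; the abstract does not specify the exact ANOVA design, so this is an illustration rather than the authors' procedure. The function names (`one_way_anova_f`, `rank_features`), the feature names (`f1_hz`, `noise`), and all numeric values are hypothetical and are not taken from the paper or its corpora.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups (one list of values per group)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group variability: how far each group mean lies from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variability: spread of the values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


def rank_features(data):
    """Map {feature: {emotion: [values]}} to a list of (feature, F) pairs, largest F first."""
    scores = {
        name: one_way_anova_f(list(by_emotion.values()))
        for name, by_emotion in data.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up illustrative values, NOT measurements from KSUEmotions or EPST.
toy = {
    "f1_hz": {
        "neutral": [500.0, 510.0, 490.0],
        "sadness": [450.0, 460.0, 440.0],
        "happiness": [600.0, 610.0, 590.0],
        "anger": [650.0, 660.0, 640.0],
    },
    "noise": {
        "neutral": [1.0, 2.0, 3.0],
        "sadness": [2.0, 3.0, 1.0],
        "happiness": [3.0, 1.0, 2.0],
        "anger": [1.0, 3.0, 2.0],
    },
}

ranking = rank_features(toy)
for feature, f_stat in ranking:
    print(f"{feature}: F = {f_stat:.2f}")
```

A feature whose values differ strongly between emotions relative to their spread within each emotion receives a large F statistic and is a candidate for the recognizer's input; a feature with identical group means receives F = 0. A full analysis would also convert F to a p-value using the F distribution, which this sketch omits.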
Pages: 4