Automatic Speech Emotion Recognition: a Systematic Literature Review

Authors
Mustafa H.H. [2]
Darwish N.R. [2]
Hefny H.A. [1]
Affiliations
[1] Computer Science Department, Faculty of Graduate Studies for Statistical Research, Cairo University, Giza
[2] Information Systems and Technology Department, Faculty of Graduate Studies for Statistical Research, Cairo University, Giza
Keywords
Automatic speech recognition; Emotional speech; Speech emotion recognition; Speech recognition
DOI
10.1007/s10772-024-10096-7
Abstract
Automatic Speech Emotion Recognition (ASER) has recently garnered attention across various fields including artificial intelligence, pattern recognition, and human–computer interaction. However, ASER encounters numerous challenges such as a shortage of diverse datasets, appropriate feature selection, and suitable intelligent recognition techniques. To address these challenges, a systematic literature review (SLR) was conducted following established guidelines. A total of 60 primary research papers spanning from 2011 to 2023 were reviewed to investigate, interpret, and analyze the related literature by addressing five key research questions. Despite being an emerging area with applications in real-life scenarios, ASER still grapples with limitations in existing techniques. This SLR provides a comprehensive overview of existing techniques, datasets, and feature extraction tools in the ASER domain, shedding light on the weaknesses of current research studies. Additionally, it outlines a list of limitations for consideration in future work. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.
Pages: 267-285
Number of pages: 18