Speech emotion recognition approaches: A systematic review

Cited by: 4
Authors
Hashem, Ahlam [1]
Arif, Muhammad [1]
Alghamdi, Manal [1]
Affiliations
[1] Umm Al Qura Univ, Dept Comp Sci, Al Abdiyah, Makkah, Saudi Arabia
Keywords
Speech emotion recognition; Emotional speech database; Classification of emotion; Speech features; Systematic review; TIME-COURSE; NEURAL-NETWORK; FEATURES; SELECTION; DOMAIN; REPRESENTATIONS; CLASSIFICATION; CLASSIFIERS; INFORMATION; PERFORMANCE;
DOI
10.1016/j.specom.2023.102974
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Speech emotion recognition (SER) has been an active field since it became a crucial feature of advanced Human-Computer Interaction (HCI), and it is used in a wide range of real-life applications. In recent years, researchers have addressed many aspects of SER systems, including the availability of appropriate emotional databases, the selection of robust features, and the application of suitable classifiers based on Machine Learning (ML) and Deep Learning (DL). Deep models have proved more accurate for SER than conventional ML techniques. Nevertheless, SER remains a challenging classification task: separating similar emotional patterns requires a highly discriminative feature representation. This survey therefore critically analyzes the work done in this field, in light of previous studies that recognize emotions from speech audio, and reviews the current state of SER using DL. A systematic literature review based on searching selected keywords over the period 2012-2022 yielded 96 papers covering the most current findings and directions. Specifically, we cover the databases (acted, evoked, and natural), the features (prosodic, spectral, voice quality, and Teager energy operator), and the necessary preprocessing steps. Furthermore, different DL models and their performance are examined in depth. Based on our review, we also suggest aspects of SER that could be considered in future work.
Pages: 29
Related papers
50 in total
  • [21] Speech Emotion Recognition
    Lalitha, S.
    Madhavan, Abhishek
    Bhushan, Bharath
    Saketh, Srinivas
    [J]. 2014 INTERNATIONAL CONFERENCE ON ADVANCES IN ELECTRONICS, COMPUTERS AND COMMUNICATIONS (ICAECC), 2014
  • [22] Machine Learning Approaches for Speech Emotion Recognition: Classic and Novel Advances
    Heracleous, Panikos
    Ishikawa, Akio
    Yasuda, Keiji
    Kawashima, Hiroyuki
    Sugaya, Fumiaki
    Hashimoto, Masayuki
    [J]. COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, CICLING 2017, PT II, 2018, 10762 : 180 - 191
  • [23] Emotion recognition of audio/speech data using deep learning approaches
    Gupta, Vedika
    Juyal, Stuti
    Singh, Gurvinder Pal
    Killa, Chirag
    Gupta, Nishant
    [J]. JOURNAL OF INFORMATION & OPTIMIZATION SCIENCES, 2020, 41 (06) : 1309 - 1317
  • [24] Human emotion recognition using intelligent approaches: A review
    Chowdary, M. Kalpana
    Hemanth, D. Jude
    [J]. INTELLIGENT DECISION TECHNOLOGIES-NETHERLANDS, 2019, 13 (04) : 417 - 433
  • [25] Speech recognition in the radiology department: a systematic review
    Hammana, Imane
    Lepanto, Luigi
    Poder, Thomas
    Bellemare, Christian
    Ly, My-Sandra
    [J]. HEALTH INFORMATION MANAGEMENT JOURNAL, 2015, 44 (02) : 4 - 10
  • [26] Automatic Speech Recognition: Systematic Literature Review
    Alharbi, Sadeen
    Alrazgan, Muna
    Alrashed, Alanoud
    Alnomasi, Turkiayh
    Almojel, Raghad
    Alharbi, Rimah
    Alharbi, Saja
    Alturki, Sahar
    Alshehri, Fatimah
    Almojil, Maha
    [J]. IEEE ACCESS, 2021, 9 : 131858 - 131876
  • [27] Acoustic modeling in speech recognition: A systematic review
    Bhatt, Shobha
    Jain, Anurag
    Dev, Amita
    [J]. International Journal of Advanced Computer Science and Applications, 2020, 11 (04) : 397 - 412
  • [28] Acoustic Modeling in Speech Recognition: A Systematic Review
    Bhatt, Shobha
    Jain, Anurag
    Dev, Amita
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2020, 11 (04) : 397 - 412
  • [29] Speech Emotion Recognition Using Deep Learning Techniques: A Review
    Khalil, Ruhul Amin
    Jones, Edward
    Babar, Mohammad Inayatullah
    Jan, Tariqullah
    Zafar, Mohammad Haseeb
    Alhussain, Thamer
    [J]. IEEE ACCESS, 2019, 7 : 117327 - 117345
  • [30] Speech Emotion Recognition Systems: A Comprehensive Review on Different Methodologies
    Anthony, Audre Arlene
    Patil, Chandreshekar Mohan
    [J]. WIRELESS PERSONAL COMMUNICATIONS, 2023, 130 (01) : 515 - 525