Survey of Deep Representation Learning for Speech Emotion Recognition

Cited by: 45
Authors
Latif, Siddique [1 ,2 ]
Rana, Rajib [1 ]
Khalifa, Sara [3 ,4 ,5 ]
Jurdak, Raja
Qadir, Junaid [6 ]
Schuller, Bjorn [7 ,8 ]
Affiliations
[1] Univ Southern Queensland USQ, Springfield, Qld 4300, Australia
[2] Data61 CSIRO, Distributed Sensing Syst Grp, Pullenvale, Qld 4069, Australia
[3] Data61 CSIRO, Distributed Sensing Syst Grp, Pullenvale, Qld 4069, Australia
[4] Univ New South Wales, Sydney, NSW 2052, Australia
[5] Univ Queensland, St Lucia, Qld 4072, Australia
[6] Qatar Univ, Coll Engn, Dept Comp Sci & Engn, Doha, Qatar
[7] Imperial Coll London, Grp Language Audio & Mus, London SW7 2BX, England
[8] Univ Augsburg, Embedded Intelligence Hlth Care & Wellbeing, D-86159 Augsburg, Germany
Keywords
Speech emotion recognition; multi task learning; representation learning; domain adaptation; unsupervised learning; COMPONENT ANALYSIS; LADDER NETWORKS; FEATURES; CORPUS; ADVERSARIAL; DIMENSIONALITY; ARCHITECTURES; CLASSIFIERS; ALGORITHM; DATABASES;
DOI
10.1109/TAFFC.2021.3114365
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Traditionally, speech emotion recognition (SER) research has relied on handcrafted acoustic features produced through feature engineering. However, the design of handcrafted features for complex SER tasks requires significant manual effort, which impedes generalisability and slows the pace of innovation. This has motivated the adoption of representation learning techniques that can automatically learn an intermediate representation of the input signal without any manual feature engineering. Representation learning has led to improved SER performance and enabled rapid innovation. Its effectiveness has further increased with advances in deep learning (DL), which has facilitated deep representation learning, where hierarchical representations are automatically learned in a data-driven manner. This article presents the first comprehensive survey on the important topic of deep representation learning for SER. We highlight various techniques and related challenges, and identify important future areas of research. Our survey bridges a gap in the literature, since existing surveys focus either on SER with hand-engineered features or on representation learning in the general setting without focusing on SER.
Pages: 1634-1654
Page count: 21
Related papers
50 in total
  • [41] Urdu Speech Emotion Recognition using Speech Spectral Features and Deep Learning Techniques
    Taj, Soonh
    Shaikh, Ghulam Mujtaba
    Hassan, Saif
    Nimra
    [J]. 2023 4th International Conference on Computing, Mathematics and Engineering Technologies: Sustainable Technologies for Socio-Economic Development, iCoMET 2023, 2023,
  • [42] Hybrid deep learning models based emotion recognition with speech signals
    Chowdary, M. Kalpana
    Priya, E. Anu
    Danciulescu, Daniela
    Anitha, J.
    Hemanth, D. Jude
    [J]. INTELLIGENT DECISION TECHNOLOGIES-NETHERLANDS, 2023, 17 (04): : 1435 - 1453
  • [43] An Emotion Recognition Method Using Speech Signals Based on Deep Learning
    Byun, Sung-woo
    Shin, Bo-ra
    Lee, Seok-Pil
    [J]. BASIC & CLINICAL PHARMACOLOGY & TOXICOLOGY, 2019, 124 : 181 - 182
  • [44] Efficient Emotion Recognition from Speech Using Deep Learning on Spectrograms
    Satt, Aharon
    Rozenberg, Shai
    Hoory, Ron
    [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1089 - 1093
  • [45] Speech Emotion Recognition Using Deep Learning LSTM for Tamil Language
    Fernandes, Bennilo
    Mannepalli, Kasiprasad
    [J]. PERTANIKA JOURNAL OF SCIENCE AND TECHNOLOGY, 2021, 29 (03): : 1915 - 1936
  • [46] Deep Learning Based Human Emotion Recognition from Speech Signal
    Queen, M. P. Flower
    Sankar, S. Perumal
    Aurtherson, P. Babu
    Jeyakumar, P.
    [J]. BIOSCIENCE BIOTECHNOLOGY RESEARCH COMMUNICATIONS, 2020, 13 (06): : 119 - 124
  • [47] Student's Feedback by emotion and speech recognition through Deep Learning
    Jain, Ati
    Sah, Hare Ram
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION, AND INTELLIGENT SYSTEMS (ICCCIS), 2021, : 442 - 447
  • [48] Deep Learning Algorithms for Speech Emotion Recognition with Hybrid Spectral Features
    Kogila R.
    Sadanandam M.
    Bhukya H.
    [J]. SN Computer Science, 5 (1)
  • [49] Deep Learning Techniques for Speech Emotion Recognition, from Databases to Models
    Abbaschian, Babak Joze
    Sierra-Sosa, Daniel
    Elmaghraby, Adel
    [J]. SENSORS, 2021, 21 (04) : 1 - 27
  • [50] Emotion recognition of audio/speech data using deep learning approaches
    Gupta, Vedika
    Juyal, Stuti
    Singh, Gurvinder Pal
    Killa, Chirag
    Gupta, Nishant
    [J]. JOURNAL OF INFORMATION & OPTIMIZATION SCIENCES, 2020, 41 (06): : 1309 - 1317