Speech Emotion Recognition Using Deep Learning

Times cited: 0
Authors
Alagusundari, N. [1]
Anuradha, R. [1]
Affiliations
[1] Sri Ramakrishna Engn Coll Coimbatore, Dept Comp Sci & Engn, Coimbatore, India
Keywords
Deep learning; SER (speech emotion recognition); TCN (temporal convolutional network); CNN (convolutional neural network); GRU (gated recurrent unit); DANN (domain adversarial neural network); MFCC (mel-frequency cepstral coefficients);
DOI
10.1007/978-981-99-8476-3_25
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Speech emotion recognition (SER) can be used in many applications, mainly in the fields of mental health and human-robot interaction. SER can be used to monitor anxiety, depression, and post-traumatic stress disorder, among other mental health disorders. In this work, we have developed deep learning models such as CNN, DANN, and TCN to recognize emotional states from speech signals. Each model is trained on different datasets using different feature extraction techniques, such as MFCC, to recognize various emotions. The emotional states of a person can be classified based on factors like pitch, tone, and intensity, and on dimensions of emotion such as arousal and valence. We have used four different datasets for training and evaluating the models. This work used CNN, GRU, DANN, and TCN with various feature extraction techniques; among these, TCN performs best on large datasets (58 MFCC features), with 93.66% accuracy on eight emotion classes (Angry, Calm, Disgust, Fear, Happy, Neutral, Sad, Surprise).
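The abstract names the temporal convolutional network (TCN) as the best-performing model but gives no architectural details. As a rough illustration of the general idea behind a TCN, and not the authors' implementation, the sketch below shows a causal dilated 1-D convolution and a small residual stack in plain Python. The function names, the kernel values, and the dilation schedule are all assumptions chosen for illustration; in practice the input would be a sequence of feature frames (for example MFCC vectors) rather than scalars.

```python
from typing import List, Sequence


def causal_dilated_conv1d(
    x: Sequence[float], kernel: Sequence[float], dilation: int = 1
) -> List[float]:
    """Compute y[t] = sum_k kernel[k] * x[t - k * dilation].

    Samples before the start of the sequence are treated as zero, so
    y[t] depends only on x[t] and earlier samples (causality).
    """
    if dilation < 1:
        raise ValueError("dilation must be >= 1")
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # implicit left zero-padding
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Time steps visible to one output after one conv layer per dilation."""
    return 1 + (kernel_size - 1) * sum(dilations)


def tcn_stack(
    x: Sequence[float], kernel: Sequence[float], dilations: Sequence[int]
) -> List[float]:
    """Hypothetical toy TCN: per layer, causal dilated conv, ReLU, residual add."""
    h = [float(v) for v in x]
    for d in dilations:
        conv = causal_dilated_conv1d(h, kernel, d)
        h = [max(0.0, c) + r for c, r in zip(conv, h)]
    return h


if __name__ == "__main__":
    signal = [1.0, 2.0, 3.0, 4.0]
    print(causal_dilated_conv1d(signal, [1.0, 1.0], dilation=1))
    print(causal_dilated_conv1d(signal, [1.0, 1.0], dilation=2))
    print(receptive_field(kernel_size=2, dilations=[1, 2, 4]))
```

Doubling the dilation at each layer, as in the assumed schedule `[1, 2, 4]`, lets the receptive field grow quickly with depth, which is one common reason such networks are applied to long audio feature sequences.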
Pages: 313-325
Number of pages: 13
Related papers
50 in total
  • [1] Speech Emotion Recognition Using Deep Learning
    Ahmed, Waqar
    Riaz, Sana
    Iftikhar, Khunsa
    Konur, Savas
    [J]. ARTIFICIAL INTELLIGENCE XL, AI 2023, 2023, 14381 : 191 - 197
  • [2] Speech Emotion Recognition with Deep Learning
    Harar, Pavol
    Burget, Radim
    Dutta, Malay Kishore
    [J]. 2017 4TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND INTEGRATED NETWORKS (SPIN), 2017, : 137 - 140
  • [3] Speech Emotion Recognition Using Deep Learning Techniques: A Review
    Khalil, Ruhul Amin
    Jones, Edward
    Babar, Mohammad Inayatullah
    Jan, Tariqullah
    Zafar, Mohammad Haseeb
    Alhussain, Thamer
    [J]. IEEE ACCESS, 2019, 7 : 117327 - 117345
  • [4] Emotion recognition from speech using deep learning on spectrograms
    Li, Xingguang
    Song, Wenjun
    Liang, Zonglin
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2020, 39 (03) : 2791 - 2796
  • [5] Speech Emotion Recognition Using Deep Learning on audio recordings
    Suganya, S.
    Charles, E. Y. A.
    [J]. 2019 19TH INTERNATIONAL CONFERENCE ON ADVANCES IN ICT FOR EMERGING REGIONS (ICTER - 2019), 2019,
  • [6] Emotion Recognition in Speech with Deep Learning Architectures
    Erdal, Mehmet
    Kaechele, Markus
    Schwenker, Friedhelm
    [J]. ARTIFICIAL NEURAL NETWORKS IN PATTERN RECOGNITION, 2016, 9896 : 298 - 311
  • [7] Speech Emotion Recognition Using Deep Learning LSTM for Tamil Language
    Fernandes, Bennilo
    Mannepalli, Kasiprasad
    [J]. PERTANIKA JOURNAL OF SCIENCE AND TECHNOLOGY, 2021, 29 (03) : 1915 - 1936
  • [8] Efficient Emotion Recognition from Speech Using Deep Learning on Spectrograms
    Satt, Aharon
    Rozenberg, Shai
    Hoory, Ron
    [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1089 - 1093
  • [9] An Emotion Recognition Method Using Speech Signals Based on Deep Learning
    Byun, Sung-woo
    Shin, Bo-ra
    Lee, Seok-Pil
    [J]. BASIC & CLINICAL PHARMACOLOGY & TOXICOLOGY, 2019, 124 : 181 - 182
  • [10] Emotion recognition of audio/speech data using deep learning approaches
    Gupta, Vedika
    Juyal, Stuti
    Singh, Gurvinder Pal
    Killa, Chirag
    Gupta, Nishant
    [J]. JOURNAL OF INFORMATION & OPTIMIZATION SCIENCES, 2020, 41 (06) : 1309 - 1317