Speech Emotion Recognition using Convolution Neural Networks and Deep Stride Convolutional Neural Networks

Cited by: 16
Authors
Wani, Taiba Majid [1]
Gunawan, Teddy Surya [2, 3]
Qadri, Syed Asif Ahmad [1]
Mansor, Hasmah [1]
Kartiwi, Mira [4]
Ismail, Nanang [5]
Affiliations
[1] Int Islamic Univ Malaysia, Elect & Comp Eng Dept, Kuala Lumpur, Malaysia
[2] IIUM, ECE Dept, Kuala Lumpur, Malaysia
[3] Univ Potensi Utama, FTIK, Medan City, Indonesia
[4] Int Islamic Univ Malaysia, Informat Syst Dept, Kuala Lumpur, Malaysia
[5] UIN Sunan Gunung Djati, Dept Elect Engn, Bandung, Indonesia
Keywords
speech emotion recognition; spectrogram; strides; convolutional neural network (CNN); deep stride convolutional neural network (DSCNN)
DOI
10.1109/icwt50448.2020.9243622
Chinese Library Classification number
TN [Electronic technology, communication technology]
Discipline classification code
0809
Abstract
A variety of techniques has been presented in the area of Speech Emotion Recognition (SER), where the main focus is to recognize the salient discriminants and useful features of speech signals. These features are then classified to recognize the specific emotion of a speaker. In recent times, deep learning techniques have emerged as a breakthrough in detecting and classifying emotions from speech. In this paper, we have modified a recently developed convolutional neural network architecture, the Deep Stride Convolutional Neural Network (DSCNN), by using a smaller number of convolutional layers to increase computational speed while still maintaining accuracy. In addition, we trained a state-of-the-art CNN model and the proposed DSCNN on spectrograms generated from the SAVEE speech emotion dataset. For evaluation, four emotions were considered: angry, happy, neutral, and sad. Evaluation results show that the proposed DSCNN architecture, with a prediction accuracy of 87.8%, outperforms the CNN, which achieves 79.4% accuracy.
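The abstract's two technical ingredients, spectrogram inputs and strided convolutions, can be illustrated with a minimal sketch. It is not the authors' implementation: the sample rate, frame length, hop size, 3x3 kernel, stride of 2, and three-layer depth are all assumptions chosen for illustration, and a synthetic tone stands in for a SAVEE recording. It shows how a log-magnitude spectrogram is computed and how strided convolutions shrink the feature map without separate pooling layers.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram via a short-time Fourier transform."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(mag).T  # shape: (freq_bins, time_frames)

def conv_out_size(size, kernel, stride, padding=0):
    """Output length of a convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1

# Assumed input: 1 second of a 440 Hz tone sampled at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
spec = spectrogram(np.sin(2 * np.pi * 440 * t))
print("spectrogram shape:", spec.shape)

# Hypothetical stack of three 3x3 convolutions with stride 2:
# each layer roughly halves both axes, doing the job of pooling.
h, w = spec.shape
for layer in range(1, 4):
    h, w = conv_out_size(h, 3, 2), conv_out_size(w, 3, 2)
    print(f"after strided conv {layer}: {h} x {w}")
```

Using the stride to downsample means each layer both extracts features and reduces resolution, which is one plausible reason a stride-based network can use fewer layers; the abstract itself does not give the layer configuration.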
Pages: 6