A novel convolutional neural network with gated recurrent unit for automated speech emotion recognition and classification

Cited by: 11
Authors
Prakash, P. Ravi [1]
Anuradha, D. [2]
Iqbal, Javid [3]
Galety, Mohammad Gouse [4]
Singh, Ruby [5]
Neelakandan, S. [6]
Affiliations
[1] Prasad V Potluri Siddhartha Inst Technol, Dept IT, Vijayawada, India
[2] Panimalar Engn Coll, Dept Comp Sci & Business Syst, Chennai, Tamil Nadu, India
[3] UCSI Univ, Inst Comp Sci & Digital Technol ICSDI, Kuala Lumpur, Malaysia
[4] Catholic Univ Erbil, Coll IT & CS, Dept Informat Technol, Erbil, Iraq
[5] SRM Inst Sci & Technol, Dept CSE, Ghaziabad, Uttar Pradesh, India
[6] RMK Engn Coll, Dept CSE, Sriperumbudur, India
Keywords
Emotion recognition; speech recognition; deep learning; classification model; Berlin emotion dataset; DOMAIN-ADVERSARIAL; FEATURES; MODELS
DOI
10.1080/23307706.2022.2085198
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Automated Speech Emotion Recognition (SER) has become more popular and increasingly applicable. SER concentrates on the automatic identification of a person's emotional state from speech signals. It mainly depends on in-depth analysis of the speech signal: it extracts features containing emotional details from the signal and applies pattern recognition techniques to identify the emotional state. The major problem in automatic SER is extracting discriminative, powerful, and emotionally salient features from the acoustic content of speech signals. The proposed model aims to detect and classify three emotional states of speech: happy, neutral, and sad. The presented model uses a Convolutional Neural Network - Gated Recurrent Unit (CNN-GRU) based feature extraction technique, which derives a set of feature vectors. A comprehensive simulation is carried out using the Berlin German Database and the SJTU Chinese Database, which comprise numerous audio files under a collection of different emotion labels.
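As a rough illustration of the CNN-GRU idea described in the abstract, the sketch below runs a 1D convolution over a sequence of per-frame acoustic feature vectors, feeds the result through a GRU, and maps the final hidden state to the three emotion classes. It is not the authors' implementation: the layer sizes, the random untrained weights, the omission of bias terms, and every function name (`conv1d_relu`, `gru_step`, `classify`, `init_params`) are assumptions made only for this example.

```python
import math
import random

EMOTIONS = ["happy", "neutral", "sad"]  # the three classes named in the abstract


def rand_matrix(rows, cols, rng):
    # Random, untrained weights: illustrative only.
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def conv1d_relu(frames, kernels, width):
    """Slide each kernel over `width` consecutive frames (no padding), then ReLU."""
    out = []
    for t in range(len(frames) - width + 1):
        window = [x for frame in frames[t:t + width] for x in frame]
        out.append([max(0.0, y) for y in matvec(kernels, window)])
    return out


def gru_step(x, h, p):
    """One GRU update (biases omitted): update gate z, reset gate r, candidate c."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    rh = [ri * hi for ri, hi in zip(r, h)]
    c = [math.tanh(a + b) for a, b in zip(matvec(p["Wh"], x), matvec(p["Uh"], rh))]
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, c)]


def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]


def init_params(n_feat, n_filters=4, width=3, hidden=5, seed=0):
    # All sizes here are hypothetical defaults, not values from the paper.
    rng = random.Random(seed)
    gru = {}
    for g in ("z", "r", "h"):
        gru["W" + g] = rand_matrix(hidden, n_filters, rng)
        gru["U" + g] = rand_matrix(hidden, hidden, rng)
    return {
        "kernels": rand_matrix(n_filters, n_feat * width, rng),
        "width": width,
        "hidden": hidden,
        "gru": gru,
        "out": rand_matrix(len(EMOTIONS), hidden, rng),
    }


def classify(frames, params):
    """CNN features -> GRU over time -> softmax over the three emotions."""
    feats = conv1d_relu(frames, params["kernels"], params["width"])
    h = [0.0] * params["hidden"]
    for x in feats:
        h = gru_step(x, h, params["gru"])
    probs = softmax(matvec(params["out"], h))
    return EMOTIONS[probs.index(max(probs))], probs


if __name__ == "__main__":
    rng = random.Random(1)
    # Ten frames of six made-up acoustic features each.
    frames = [[rng.uniform(-1, 1) for _ in range(6)] for _ in range(10)]
    print(classify(frames, init_params(n_feat=6)))
```

Because the weights are untrained, the predicted label is meaningless; the sketch only shows how data flows through the convolutional and recurrent stages.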
Pages: 54-63
Page count: 10
Related papers
50 in total
  • [1] Speech Emotion Recognition Using Deep Convolutional Neural Network and Simple Recurrent Unit
    Jiang, Pengxu
    Fu, Hongliang
    Tao, Huawei
    [J]. ENGINEERING LETTERS, 2019, 27 (04) : 901 - 906
  • [2] Speech Emotion Recognition Based on Dual-Channel Convolutional Gated Recurrent Network
    Sun, Hanyu
    Huang, Lixia
    Zhang, Xueying
    Li, Juan
    [J]. Computer Engineering and Applications, 1600, 2 (170-177):
  • [3] Speech Emotion Recognition Based on a Recurrent Neural Network Classification Model
    Fonnegra, Ruben D.
    Diaz, Gloria M.
    [J]. ADVANCES IN COMPUTER ENTERTAINMENT TECHNOLOGY, ACE 2017, 2018, 10714 : 882 - 892
  • [4] Parallelized Convolutional Recurrent Neural Network With Spectral Features for Speech Emotion Recognition
    Jiang, Pengxu
    Fu, Hongliang
    Tao, Huawei
    Lei, Peizhi
    Zhao, Li
    [J]. IEEE ACCESS, 2019, 7 : 90368 - 90377
  • [5] Multimodal speech emotion recognition and classification using convolutional neural network techniques
    A. Christy
    S. Vaithyasubramanian
    A. Jesudoss
    M. D. Anto Praveena
    [J]. International Journal of Speech Technology, 2020, 23 : 381 - 388
  • [6] Multimodal speech emotion recognition and classification using convolutional neural network techniques
    Christy, A.
    Vaithyasubramanian, S.
    Jesudoss, A.
    Praveena, M. D. Anto
    [J]. INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2020, 23 (02) : 381 - 388
  • [7] Design of a Convolutional Neural Network for Speech Emotion Recognition
    Lee, Kyong Hee
    Kim, Do Hyun
    [J]. 11TH INTERNATIONAL CONFERENCE ON ICT CONVERGENCE: DATA, NETWORK, AND AI IN THE AGE OF UNTACT (ICTC 2020), 2020, : 1332 - 1335
  • [8] CONVOLUTIONAL NEURAL NETWORK TECHNIQUES FOR SPEECH EMOTION RECOGNITION
    Parthasarathy, Srinivas
    Tashev, Ivan
    [J]. 2018 16TH INTERNATIONAL WORKSHOP ON ACOUSTIC SIGNAL ENHANCEMENT (IWAENC), 2018, : 121 - 125
  • [9] DEEP CONVOLUTIONAL RECURRENT NEURAL NETWORK WITH ATTENTION MECHANISM FOR ROBUST SPEECH EMOTION RECOGNITION
    Huang, Che-Wei
    Narayanan, Shrikanth
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2017, : 583 - 588
  • [10] 3D Convolutional Recurrent Global Neural Network for Speech Emotion Recognition
    Zayene, Baraa
    Jlassi, Chiraz
    Arous, Najet
    [J]. 2020 5TH INTERNATIONAL CONFERENCE ON ADVANCED TECHNOLOGIES FOR SIGNAL AND IMAGE PROCESSING (ATSIP'2020), 2020,