Deep Convolutional Neural Network and Gray Wolf Optimization Algorithm for Speech Emotion Recognition

Authors
Mohammad Reza Falahzadeh
Fardad Farokhi
Ali Harimi
Reza Sabbaghi-Nadooshan
Affiliations
[1] Islamic Azad University, Department of Electrical Engineering, Central Tehran Branch
[2] Islamic Azad University, Department of Biomedical Engineering, Central Tehran Branch
[3] Islamic Azad University, Department of Electrical Engineering, Shahrood Branch
Keywords
Speech emotion recognition; 3D tensor speech representation; Chaogram; Deep convolutional neural network; Gray wolf optimization algorithm
DOI
Not available
Abstract
Speech emotion recognition (SER), an important method of emotional human–machine interaction, has been the focus of much research in recent years. Motivated by the ability of deep convolutional neural networks (DCNNs) to learn features and by the landmark success of these networks in image classification, the present study aims to adapt a pre-trained DCNN model to SER and to provide compatible input to these networks by converting a speech signal into a 3D tensor. First, speech samples are reconstructed in a 3D phase space using the reconstructed phase space technique. Studies have shown that the patterns formed in this space contain meaningful emotional features of the speaker. To provide an input that is compatible with a DCNN, a new speech signal representation called the Chaogram is introduced as the projection of these patterns, yielding three channels similar to those of an RGB image. Next, image enhancement techniques are used to highlight the details of the Chaogram images. Then, the Visual Geometry Group (VGG) DCNN, pre-trained on the large ImageNet dataset, is used to learn high-level Chaogram features and the corresponding emotion classes. Finally, transfer learning is applied, and the proposed model is fine-tuned on the target datasets. To optimize the hyper-parameter arrangement of architecture-determined CNNs, an innovative DCNN-GWO (gray wolf optimization) approach is also presented. Results on two public emotion datasets, EMO-DB and eNTERFACE05, indicate promising performance of the proposed model for SER applications.
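The first two steps of the abstract (phase-space reconstruction and projection into a three-channel image) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a time-delay embedding for the reconstruction and assumes that the three channels are 2D histograms of the trajectory projected onto the three coordinate planes. The function names `reconstruct_phase_space` and `chaogram_like`, and the parameters `delay` and `bins`, are hypothetical and chosen only for illustration.

```python
import numpy as np


def reconstruct_phase_space(signal, delay=1, dim=3):
    """Time-delay embedding: row i is (x[i], x[i + delay], ..., x[i + (dim-1)*delay])."""
    signal = np.asarray(signal, dtype=float)
    n_points = signal.size - (dim - 1) * delay
    if n_points <= 0:
        raise ValueError("signal is too short for this delay and dimension")
    columns = [signal[k * delay:k * delay + n_points] for k in range(dim)]
    return np.stack(columns, axis=1)


def chaogram_like(signal, delay=1, bins=64):
    """Project a 3D phase-space trajectory onto three planes (assumed layout).

    Returns an array of shape (bins, bins, 3) with values in [0, 1],
    i.e. a tensor with the same layout as an RGB image.
    """
    points = reconstruct_phase_space(signal, delay=delay, dim=3)

    # Scale all coordinates jointly to [0, 1] so the three planes share one grid.
    low, high = points.min(), points.max()
    span = high - low if high > low else 1.0
    points = (points - low) / span

    channels = []
    for a, b in ((0, 1), (0, 2), (1, 2)):  # assumed planes: xy, xz, yz
        hist, _, _ = np.histogram2d(
            points[:, a], points[:, b], bins=bins, range=[[0, 1], [0, 1]]
        )
        peak = hist.max()
        channels.append(hist / peak if peak > 0 else hist)
    return np.stack(channels, axis=-1)


if __name__ == "__main__":
    # Synthetic stand-in for one speech frame; real input would be audio samples.
    t = np.linspace(0.0, 1.0, 4000)
    frame = np.sin(2 * np.pi * 50 * t) + 0.3 * np.sin(2 * np.pi * 120 * t)
    image = chaogram_like(frame, delay=8, bins=64)
    print(image.shape)
```

An array shaped like this can be resized and passed to an ImageNet-pre-trained network such as VGG; the image enhancement and fine-tuning steps described in the abstract are not shown here.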
Pages: 449-492
Page count: 43
Related papers
50 in total
  • [21] Deep Convolutional Neural Networks for Feature Extraction in Speech Emotion Recognition
    Heracleous, Panikos
    Mohammad, Yasser
    Yoneyama, Akio
    [J]. HUMAN-COMPUTER INTERACTION. RECOGNITION AND INTERACTION TECHNOLOGIES, HCI 2019, PT II, 2019, 11567 : 117 - 132
  • [22] Improvement on Speech Emotion Recognition Based on Deep Convolutional Neural Networks
    Niu, Yafeng
    Zou, Dongsheng
    Niu, Yadong
    He, Zhongshi
    Tan, Hua
    [J]. PROCEEDINGS OF 2018 INTERNATIONAL CONFERENCE ON COMPUTING AND ARTIFICIAL INTELLIGENCE (ICCAI 2018), 2018, : 13 - 18
  • [23] Recognition of emotion in music based on deep convolutional neural network
    Rajib Sarkar
    Sombuddha Choudhury
    Saikat Dutta
    Aneek Roy
    Sanjoy Kumar Saha
    [J]. Multimedia Tools and Applications, 2020, 79 : 765 - 783
  • [24] Recognition of emotion in music based on deep convolutional neural network
    Sarkar, Rajib
    Choudhury, Sombuddha
    Dutta, Saikat
    Roy, Aneek
    Saha, Sanjoy Kumar
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2020, 79 (1-2) : 765 - 783
  • [25] Facial Emotion Recognition Using Deep Convolutional Neural Network
    Pranav, E.
    Kamal, Suraj
    Chandran, Satheesh C.
    Supriya, M. H.
    [J]. 2020 6TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING AND COMMUNICATION SYSTEMS (ICACCS), 2020, : 317 - 320
  • [26] Convolutional Neural Network Hyperparameters Optimization for Facial Emotion Recognition
    Vulpe-Grigorasi, Adrian
    Grigore, Ovidiu
    [J]. 2021 12TH INTERNATIONAL SYMPOSIUM ON ADVANCED TOPICS IN ELECTRICAL ENGINEERING (ATEE), 2021,
  • [27] Speech Emotion Recognition in Neurological Disorders Using Convolutional Neural Network
    Zisad, Sharif Noor
    Hossain, Mohammad Shahadat
    Andersson, Karl
    [J]. BRAIN INFORMATICS, BI 2020, 2020, 12241 : 287 - 296
  • [28] Optimizing Speech Emotion Recognition with Hilbert Curve and convolutional neural network
    Yang, Zijun
    Zhou, Shi
    Zhang, Lifeng
    Serikawa, Seiichi
    [J]. Cognitive Robotics, 2024, 4 : 30 - 41
  • [29] Speech Emotion Recognition through Hybrid Features and Convolutional Neural Network
    Alluhaidan, Ala Saleh
    Saidani, Oumaima
    Jahangir, Rashid
    Nauman, Muhammad Asif
    Neffati, Omnia Saidani
    [J]. APPLIED SCIENCES-BASEL, 2023, 13 (08):
  • [30] Constructing Speech Emotion Recognition Model Based on Convolutional Neural Network
    Kuo, Jong-Yih
    Chen, Zhao-Ming
    Lin, Hui-Chi
    [J]. 2021 28TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE WORKSHOPS (APSECW 2021), 2021, : 52 - 56