Deep Convolutional Neural Network and Gray Wolf Optimization Algorithm for Speech Emotion Recognition

Cited by: 0

Authors:

Mohammad Reza Falahzadeh

Fardad Farokhi

Ali Harimi

Reza Sabbaghi-Nadooshan

Affiliations:

[1] Islamic Azad University, Department of Electrical Engineering, Central Tehran Branch

[2] Islamic Azad University, Department of Biomedical Engineering, Central Tehran Branch

[3] Islamic Azad University, Department of Electrical Engineering, Shahrood Branch

Source:

Circuits, Systems, and Signal Processing | 2023 / Vol. 42

Keywords:

Speech emotion recognition; 3D tensor speech representation; Chaogram; Deep convolutional neural network; Gray wolf optimization algorithm

DOI:

Not available

Abstract:

Speech emotion recognition (SER), an important component of emotional human–machine interaction, has been the focus of much research in recent years. Motivated by the ability of deep convolutional neural networks (DCNNs) to learn features and by their landmark success in image classification, this study adapts a pre-trained DCNN to SER and provides compatible input by converting the speech signal into a 3D tensor. First, speech samples are reconstructed in a 3D reconstructed phase space; studies have shown that the patterns formed in this space contain meaningful emotional features of the speaker. To provide DCNN-compatible input, a new speech signal representation called Chaogram is introduced as the projection of these patterns, yielding three channels analogous to those of an RGB image. Next, image enhancement techniques are applied to highlight the details of the Chaogram images. Then, a Visual Geometry Group (VGG) DCNN pre-trained on the large ImageNet dataset is used to learn high-level Chaogram features and the corresponding emotion classes. Finally, transfer learning is performed, and the model is fine-tuned on the target datasets. To optimize the hyper-parameters of CNNs with a fixed architecture, an innovative DCNN-GWO (gray wolf optimization) scheme is also presented. Results on two public emotion datasets, EMO-DB and eNTERFACE05, show the promising performance of the proposed model, which could improve SER applications.
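
The first two steps of the abstract (phase-space reconstruction and projection to a three-channel image) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the time-delay embedding, the delay of 8 samples, the 64 x 64 image size, the use of 2D histograms on the xy, xz, and yz planes, and the function names reconstruct_phase_space and project_to_channels are all assumptions chosen for illustration, and the input is a synthetic toy signal rather than speech.

```python
import numpy as np


def reconstruct_phase_space(signal, delay=8, dim=3):
    """Time-delay embedding: row i is (s[i], s[i + delay], s[i + 2*delay]).

    The delay value is an assumed placeholder, not a value from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    n_points = len(signal) - (dim - 1) * delay
    if n_points <= 0:
        raise ValueError("signal too short for this delay and dimension")
    columns = [signal[k * delay:k * delay + n_points] for k in range(dim)]
    return np.stack(columns, axis=1)


def project_to_channels(points, size=64):
    """Project 3D points onto the xy, xz, and yz planes as 2D histograms.

    Each histogram is scaled to [0, 1] and becomes one image channel.
    This projection scheme is an assumption made for illustration.
    """
    lo, hi = points.min(), points.max()
    if hi == lo:
        hi = lo + 1.0  # avoid a zero-width histogram range
    channels = []
    for a, b in [(0, 1), (0, 2), (1, 2)]:
        hist, _, _ = np.histogram2d(
            points[:, a], points[:, b],
            bins=size, range=[[lo, hi], [lo, hi]],
        )
        peak = hist.max()
        channels.append(hist / peak if peak > 0 else hist)
    return np.stack(channels, axis=-1)


# Synthetic two-tone signal standing in for a speech frame.
t = np.linspace(0, 1, 4000, endpoint=False)
toy_signal = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 120 * t)

points = reconstruct_phase_space(toy_signal)
image = project_to_channels(points)
print(points.shape, image.shape)
```

The resulting array has the height x width x 3 layout that image-classification networks such as VGG expect, which is the compatibility the abstract refers to; resizing to the network's input resolution and the image enhancement step are left out of this sketch.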

Pages: 449-492

Page count: 43

50 items in total

[21] Deep Convolutional Neural Networks for Feature Extraction in Speech Emotion Recognition
Heracleous, Panikos
Mohammad, Yasser
Yoneyama, Akio
[J]. HUMAN-COMPUTER INTERACTION. RECOGNITION AND INTERACTION TECHNOLOGIES, HCI 2019, PT II, 2019, 11567 : 117 - 132
[22] Improvement on Speech Emotion Recognition Based on Deep Convolutional Neural Networks
Niu, Yafeng
Zou, Dongsheng
Niu, Yadong
He, Zhongshi
Tan, Hua
[J]. PROCEEDINGS OF 2018 INTERNATIONAL CONFERENCE ON COMPUTING AND ARTIFICIAL INTELLIGENCE (ICCAI 2018), 2018, : 13 - 18
[24] Recognition of emotion in music based on deep convolutional neural network
Sarkar, Rajib
Choudhury, Sombuddha
Dutta, Saikat
Roy, Aneek
Saha, Sanjoy Kumar
[J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2020, 79 (1-2) : 765 - 783
[25] Facial Emotion Recognition Using Deep Convolutional Neural Network
Pranav, E.
Kamal, Suraj
Chandran, Satheesh C.
Supriya, M. H.
[J]. 2020 6TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING AND COMMUNICATION SYSTEMS (ICACCS), 2020, : 317 - 320
[26] Convolutional Neural Network Hyperparameters Optimization for Facial Emotion Recognition
Vulpe-Grigorasi, Adrian
Grigore, Ovidiu
[J]. 2021 12TH INTERNATIONAL SYMPOSIUM ON ADVANCED TOPICS IN ELECTRICAL ENGINEERING (ATEE), 2021,
[27] Speech Emotion Recognition in Neurological Disorders Using Convolutional Neural Network
Zisad, Sharif Noor
Hossain, Mohammad Shahadat
Andersson, Karl
[J]. BRAIN INFORMATICS, BI 2020, 2020, 12241 : 287 - 296
[28] Optimizing Speech Emotion Recognition with Hilbert Curve and convolutional neural network
Yang, Zijun
Zhou, Shi
Zhang, Lifeng
Serikawa, Seiichi
[J]. Cognitive Robotics, 2024, 4 : 30 - 41
[29] Speech Emotion Recognition through Hybrid Features and Convolutional Neural Network
Alluhaidan, Ala Saleh
Saidani, Oumaima
Jahangir, Rashid
Nauman, Muhammad Asif
Neffati, Omnia Saidani
[J]. APPLIED SCIENCES-BASEL, 2023, 13 (08):
[30] Constructing Speech Emotion Recognition Model Based on Convolutional Neural Network
Kuo, Jong-Yih
Chen, Zhao-Ming
Lin, Hui-Chi
[J]. 2021 28TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE WORKSHOPS (APSECW 2021), 2021, : 52 - 56
