Multimodal speech emotion recognition and classification using convolutional neural network techniques

Cited by: 0

Authors:

A. Christy

S. Vaithyasubramanian

A. Jesudoss

M. D. Anto Praveena

Affiliations:

[1] Sathyabama Institute of Science and Technology, Faculty of Computer Science and Engineering

[2] Sathyabama Institute of Science and Technology, Department of Mathematics

Source:

International Journal of Speech Technology | 2020, Vol. 23, No. 2

Keywords:

Speech emotion recognition; Feature extraction; Classification; SVM; CNN; Accuracy

DOI:

Not available

Abstract:

Emotion recognition plays a vital role in day-to-day interpersonal interactions. Understanding a person's feelings from their speech can help shape social interactions. A person's emotion can be identified from the tone and pitch of their voice. The acoustic speech signal is split into short frames, a fast Fourier transform is applied, and relevant features are extracted using mel-frequency cepstral coefficients (MFCC) and modulation spectral (MS) features. In this paper, linear regression, decision tree, random forest, support vector machine (SVM) and convolutional neural network (CNN) algorithms are used for classification and prediction once relevant features have been selected from the speech signals. Human emotions such as neutral, calm, happy, sad, fearful, disgust and surprise are classified using decision tree, random forest, SVM and CNN. We tested our model on the RAVDESS dataset, and CNN showed 78.20% accuracy in recognizing emotions, compared with decision tree, random forest and SVM.
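
The front end described in the abstract (framing, Fourier transform, MFCC extraction) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the 25 ms frame and 10 ms hop at 16 kHz, the Hamming window, the 26 mel filters and the 13 retained coefficients are all assumptions chosen because they are common defaults, and the modulation spectral features are not covered.

```python
import numpy as np

def frame_signal(signal, frame_len, hop_len):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    idx = np.arange(frame_len)[None, :] + hop_len * np.arange(n_frames)[:, None]
    return signal[idx]

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):        # rising edge of the triangle
            fbank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):       # falling edge of the triangle
            fbank[m - 1, k] = (right - k) / (right - center)
    return fbank

def mfcc(signal, sample_rate, frame_len=400, hop_len=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    """Return an (n_frames, n_coeffs) array of MFCCs (assumed defaults)."""
    # 1. Short overlapping frames, tapered with a Hamming window.
    frames = frame_signal(signal, frame_len, hop_len) * np.hamming(frame_len)
    # 2. FFT of each frame, converted to a power spectrum.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    # 3. Mel filterbank energies, then log compression.
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(energies + 1e-10)
    # 4. Unnormalized DCT-II over the filter axis; keep the first n_coeffs.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_coeffs)[:, None] * (2 * n + 1) / (2 * n_filters))
    return log_energies @ basis.T

# Synthetic one-second 440 Hz tone standing in for a speech recording.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)
features = mfcc(tone, sample_rate)
print(features.shape)
```

Each row of `features` describes one frame; a classifier such as an SVM or CNN would typically consume these rows, or statistics computed over them, as input.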

Pages: 381-388

Page count: 7
