Multimodal speech emotion recognition and classification using convolutional neural network techniques

Cited by: 0
Authors
A. Christy
S. Vaithyasubramanian
A. Jesudoss
M. D. Anto Praveena
Affiliations
[1] Sathyabama Institute of Science and Technology, Faculty of Computer Science and Engineering
[2] Sathyabama Institute of Science and Technology, Department of Mathematics
Keywords
Speech emotion recognition; Feature extraction; Classification; SVM; CNN; Accuracy
DOI
Not available
Abstract
Emotion recognition plays a vital role in day-to-day interpersonal interactions. Understanding a person's feelings from their speech can help shape social interactions. A person's emotion can be identified from the tone and pitch of their voice. The acoustic speech signal is split into short frames, a fast Fourier transform is applied, and relevant features are extracted using mel-frequency cepstral coefficients (MFCC) and modulation spectral (MS) features. In this paper, algorithms such as linear regression, decision tree, random forest, support vector machine (SVM) and convolutional neural networks (CNN) are used for classification and prediction once relevant features are selected from the speech signals. Human emotions such as neutral, calm, happy, sad, fearful, disgust and surprise are classified using decision tree, random forest, SVM and CNN. We tested our model on the RAVDESS dataset, and CNN showed 78.20% accuracy in recognizing emotions, compared with decision tree, random forest and SVM.
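The feature-extraction steps named in the abstract (framing, fast Fourier transform, mel-frequency cepstral coefficients) can be sketched roughly as follows. This is a minimal illustration using NumPy only, not the authors' implementation: the 16 kHz sample rate, 25 ms frames, 10 ms hop, 26 mel filters, 13 coefficients, and all function names are assumptions chosen for the example, and the modulation spectral features and the classifiers are not covered.

```python
import numpy as np

def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters with centers spaced evenly on the mel scale."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_points = np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, center, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, center):      # rising edge of the triangle
            fbank[i, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge, peak at center
            fbank[i, k] = (right - k) / (right - center)
    return fbank

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    """Return an (n_frames, n_coeffs) array of MFCCs (assumed parameters)."""
    # 1. Short overlapping frames, tapered with a Hamming window.
    frames = frame_signal(signal, frame_len, hop) * np.hamming(frame_len)
    # 2. FFT of each frame, converted to a power spectrum.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    # 3. Mel filterbank energies, then log compression.
    mel_energy = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_mel = np.log(mel_energy + 1e-10)   # small offset avoids log(0)
    # 4. Unnormalized DCT-II over the filter axis, keeping the first coefficients.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(n_coeffs), 2 * n + 1) / (2 * n_filters))
    return log_mel @ basis.T

# Usage: one second of a synthetic 440 Hz tone stands in for a speech clip.
t = np.arange(16000) / 16000.0
tone = np.sin(2 * np.pi * 440.0 * t)
coeffs = mfcc(tone)
print(coeffs.shape)  # (98, 13)
```

One common way to feed such features to a classifier like an SVM is to average the coefficients over frames (`coeffs.mean(axis=0)`) to get a fixed-length vector per utterance; whether the paper does this is not stated in the abstract.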
Pages: 381-388
Page count: 7
Related papers (50 in total)
  • [1] Multimodal speech emotion recognition and classification using convolutional neural network techniques
    Christy, A.
    Vaithyasubramanian, S.
    Jesudoss, A.
    Praveena, M. D. Anto
    [J]. INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2020, 23 (02) : 381 - 388
  • [2] CONVOLUTIONAL NEURAL NETWORK TECHNIQUES FOR SPEECH EMOTION RECOGNITION
    Parthasarathy, Srinivas
    Tashev, Ivan
    [J]. 2018 16TH INTERNATIONAL WORKSHOP ON ACOUSTIC SIGNAL ENHANCEMENT (IWAENC), 2018, : 121 - 125
  • [3] Multimodal Emotion Recognition Using a Hierarchical Fusion Convolutional Neural Network
    Zhang, Yong
    Cheng, Cheng
    Zhang, Yidie
    [J]. IEEE ACCESS, 2021, 9 : 7943 - 7951
  • [4] Speech Emotion Recognition in Neurological Disorders Using Convolutional Neural Network
    Zisad, Sharif Noor
    Hossain, Mohammad Shahadat
    Andersson, Karl
    [J]. BRAIN INFORMATICS, BI 2020, 2020, 12241 : 287 - 296
  • [5] Emotion Classification Based on Convolutional Neural Network Using Speech Data
    Vrebcevic, N.
    Mijic, I.
    Petrinovic, D.
    [J]. 2019 42ND INTERNATIONAL CONVENTION ON INFORMATION AND COMMUNICATION TECHNOLOGY, ELECTRONICS AND MICROELECTRONICS (MIPRO), 2019, : 1007 - 1012
  • [6] Design of a Convolutional Neural Network for Speech Emotion Recognition
    Lee, Kyong Hee
    Kim, Do Hyun
    [J]. 11TH INTERNATIONAL CONFERENCE ON ICT CONVERGENCE: DATA, NETWORK, AND AI IN THE AGE OF UNTACT (ICTC 2020), 2020, : 1332 - 1335
  • [7] Multimodal Emotion Recognition Based on Ensemble Convolutional Neural Network
    Huang, Haiping
    Hu, Zhenchao
    Wang, Wenming
    Wu, Min
    [J]. IEEE ACCESS, 2020, 8 : 3265 - 3271
  • [8] Speech Emotion Recognition Using Generative Adversarial Network and Deep Convolutional Neural Network
    Bhangale, Kishor
    Kothandaraman, Mohanaprasad
    [J]. CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2024, 43 (04) : 2341 - 2384
  • [10] MFGCN: Multimodal fusion graph convolutional network for speech emotion recognition
    Qi, Xin
    Wen, Yujun
    Zhang, Pengzhou
    Huang, Heyan
    [J]. Neurocomputing, 2025, 611