Improved Speech Emotion Classification Using Deep Neural Network

Cited: 0
Authors
Saeed, Mariwan Hama [1 ]
Affiliations
[1] Univ Halabja, Coll Basic Educ, Halabja, Iraq
Keywords
Emotion classification; Deep learning; Deep neural network; Librosa library; MFCC; Mel-spectrogram frequency; Chroma; Poly features; RECOGNITION; MUSIC;
DOI
10.1007/s00034-023-02446-8
Chinese Library Classification (CLC)
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology];
Subject Classification Codes
0808; 0809
Abstract
Speech emotion recognition (SER), which has gained increasing attention in recent years, is a key aspect of human-computer interaction. However, although a wide range of strategies has been proposed for SER, their performance still leaves room for improvement. In this study, a deep neural network model for classifying speech emotions is proposed. It comprises three stages: feature extraction, normalization, and emotion recognition. During feature extraction, the Librosa Python library is used to obtain the MFCC, mel-spectrogram frequency, chroma, and poly features. SMOTE (synthetic minority oversampling technique) is applied to augment the minority classes, and the Min-Max scaler is used for normalization. The model was evaluated on three widely used languages, German, English, and French, using the Berlin Emotional Speech Database (EMODB), the Surrey Audio-Visual Expressed Emotion (SAVEE) dataset, and the Canadian French Emotional (CaFE) speech dataset. In speaker-dependent experiments, the model achieves unweighted accuracies of 95% on EMODB, 90% on SAVEE, and 92% on CaFE. The results show that the proposed method recognizes emotions efficiently and outperforms the comparison approaches on the reported performance indicators.
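As a rough illustration of the three-stage pipeline the abstract describes (not the author's exact implementation), the following Python sketch mean-pools the four Librosa feature families into fixed-length vectors, balances classes with SMOTE, applies Min-Max scaling, and trains a small fully connected network. The file names, label array, and layer sizes are hypothetical placeholders.

    # Sketch of the pipeline: Librosa features -> SMOTE -> Min-Max -> DNN.
    # File names, labels, and layer sizes are illustrative assumptions,
    # not the paper's exact configuration.
    import numpy as np
    import librosa
    from imblearn.over_sampling import SMOTE
    from sklearn.preprocessing import MinMaxScaler
    from tensorflow import keras

    def extract_features(path):
        """Mean-pool MFCC, mel-spectrogram, chroma, and poly features
        over time so every clip yields one fixed-length vector."""
        y, sr = librosa.load(path, sr=None)
        feats = [
            librosa.feature.mfcc(y=y, sr=sr, n_mfcc=40),
            librosa.feature.melspectrogram(y=y, sr=sr),
            librosa.feature.chroma_stft(y=y, sr=sr),
            librosa.feature.poly_features(y=y, sr=sr, order=1),
        ]
        return np.concatenate([f.mean(axis=1) for f in feats])

    # Placeholders standing in for an emotional-speech corpus such as
    # EMODB, SAVEE, or CaFE, which provide many clips per emotion class.
    wav_paths = ["clip_000.wav", "clip_001.wav"]
    labels = np.array([0, 1])

    X = np.stack([extract_features(p) for p in wav_paths])
    X, y_bal = SMOTE().fit_resample(X, labels)   # oversample minority emotions
    X = MinMaxScaler().fit_transform(X)          # scale features to [0, 1]

    n_classes = len(np.unique(y_bal))
    model = keras.Sequential([
        keras.layers.Input(shape=(X.shape[1],)),
        keras.layers.Dense(256, activation="relu"),
        keras.layers.Dropout(0.3),
        keras.layers.Dense(128, activation="relu"),
        keras.layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(X, y_bal, epochs=50, batch_size=32, validation_split=0.2)

For brevity the scaler is fit on all data at once; in a speaker-dependent evaluation it would be fit on the training split only. SMOTE also requires several clips per class, which the real corpora supply.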
Pages: 7357 - 7376
Number of Pages: 20
Related Papers
50 records in total
  • [1] Improved Speech Emotion Classification Using Deep Neural Network
    Mariwan Hama Saeed
    [J]. Circuits, Systems, and Signal Processing, 2023, 42 : 7357 - 7376
  • [2] A Study on Speech Emotion Recognition Using a Deep Neural Network
    Lee, Kyong Hee
    Choi, Hyun Kyun
    Jang, Byung Tae
    Kim, Do Hyun
    [J]. 2019 10TH INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGY CONVERGENCE (ICTC): ICT CONVERGENCE LEADING THE AUTONOMOUS FUTURE, 2019, : 1162 - 1165
  • [3] Emotion classification in poetry text using deep neural network
    Khattak, Asad
    Asghar, Muhammad Zubair
    Khalid, Hassan Ali
    Ahmad, Hussain
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (18) : 26223 - 26244
  • [4] Emotion classification in poetry text using deep neural network
    Asad Khattak
    Muhammad Zubair Asghar
    Hassan Ali Khalid
    Hussain Ahmad
    [J]. Multimedia Tools and Applications, 2022, 81 : 26223 - 26244
  • [5] Emotion Classification Based on Convolutional Neural Network Using Speech Data
    Vrebcevic, N.
    Mijic, I.
    Petrinovic, D.
    [J]. 2019 42ND INTERNATIONAL CONVENTION ON INFORMATION AND COMMUNICATION TECHNOLOGY, ELECTRONICS AND MICROELECTRONICS (MIPRO), 2019, : 1007 - 1012
  • [6] Active Learning for Speech Emotion Recognition Using Deep Neural Network
    Abdelwahab, Mohammed
    Busso, Carlos
    [J]. 2019 8TH INTERNATIONAL CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII), 2019,
  • [7] Dari Speech Classification Using Deep Convolutional Neural Network
    Dawodi, Mursal
    Baktash, Jawid Ahamd
    Wada, Tomohisa
    Alam, Najwa
    Joya, Mohammad Zarif
    [J]. 2020 IEEE INTERNATIONAL IOT, ELECTRONICS AND MECHATRONICS CONFERENCE (IEMTRONICS 2020), 2020, : 110 - 113
  • [8] Music Emotion Classification Method Using Improved Deep Belief Network
    Tong, Guiying
    [J]. MOBILE INFORMATION SYSTEMS, 2022, 2022
  • [9] Speech Emotion Classification Using Deep Learning
    Mishra, Siba Prasad
    Warule, Pankaj
    Deb, Suman
    [J]. PROCEEDINGS OF 27TH INTERNATIONAL SYMPOSIUM ON FRONTIERS OF RESEARCH IN SPEECH AND MUSIC, FRSM 2023, 2024, 1455 : 19 - 31
  • [10] Speech Emotion Recognition Using Generative Adversarial Network and Deep Convolutional Neural Network
    Bhangale, Kishor
    Kothandaraman, Mohanaprasad
    [J]. CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2024, 43 (04) : 2341 - 2384