Speech Emotion Recognition Using Generative Adversarial Network and Deep Convolutional Neural Network

Authors
Bhangale, Kishor [1]
Kothandaraman, Mohanaprasad [1]
Affiliations
[1] VIT, SENSE, Chennai, India
Keywords
Data augmentation; Deep learning; Deep convolutional neural network; Generative adversarial network; Multi-taper Mel frequency spectrogram; Speech processing; Speech emotion recognition; Features; Classifiers
DOI
10.1007/s00034-023-02562-5
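Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Subject classification code
0808; 0809
Abstract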
Speech emotion recognition (SER) has attracted growing interest because of major advances in human-computer interaction and affective computing. In recent years, numerous deep learning-based SER schemes have shown significant improvement over traditional machine learning approaches. However, most deep learning-based SER systems are hampered by data imbalance, which arises from unequal numbers of samples across classes in the database. In addition, two-dimensional CNNs for SER typically take traditional MFCC as input, which degrades the quality of the learned deep features because of the higher variance, limited frequency resolution, and spectral leakage of traditional MFCC. This paper proposes a novel Multi-taper Mel Frequency Logarithmic Spectrogram to improve the effectiveness of the Deep Convolutional Neural Network for SER. Further, a Generative Adversarial Network is used for speech emotion data augmentation during training to deal with data scarcity in SER. The performance of the proposed SER scheme is validated using the Berlin EmoDB and RAVDESS datasets. The proposed method achieves SER accuracy of 96.65% and 97.12% on the EmoDB and RAVDESS datasets, respectively, a significant improvement over recent techniques.
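The abstract names a multi-taper Mel frequency logarithmic spectrogram but gives no formula, so the following is only a minimal sketch of the general multitaper idea, not the authors' implementation. It assumes DPSS (Slepian) tapers with equal weights, a hand-built triangular mel filterbank, and illustrative parameter values (`frame_len`, `hop`, `n_fft`, `n_mels`, `n_tapers`, `nw`); none of these names or values come from the paper. Averaging the power spectra obtained with several orthogonal tapers is the standard way to lower the variance of a single-window spectral estimate.

```python
import numpy as np
from scipy.signal.windows import dpss


def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sr):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    # n_mels + 2 band edges, equally spaced on the mel scale.
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb


def multitaper_log_mel(signal, sr, frame_len=400, hop=160, n_fft=512,
                       n_mels=40, n_tapers=6, nw=3.5):
    """Log mel spectrogram from taper-averaged power spectra.

    All default values are illustrative assumptions, not taken from the paper.
    Returns an array of shape (n_mels, n_frames).
    """
    tapers = dpss(frame_len, nw, Kmax=n_tapers)        # (n_tapers, frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx]                               # (n_frames, frame_len)
    # Apply every taper to every frame, then take the power spectrum.
    tapered = frames[:, None, :] * tapers[None, :, :]
    power = np.abs(np.fft.rfft(tapered, n=n_fft, axis=-1)) ** 2
    # Equal-weight average over tapers (assumed weighting scheme).
    avg_power = power.mean(axis=1)                     # (n_frames, n_fft // 2 + 1)
    mel_power = avg_power @ mel_filterbank(n_mels, n_fft, sr).T
    return np.log(mel_power + 1e-10).T


if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr) / sr
    tone = np.sin(2.0 * np.pi * 1000.0 * t)            # 1 s, 1 kHz test tone
    spec = multitaper_log_mel(tone, sr)
    print(spec.shape)
```

The resulting two-dimensional array could then be fed to a convolutional network in place of a conventional single-window spectrogram or MFCC image.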
Pages: 2341-2384
Number of pages: 44
Related papers
50 in total
  • [1] Speech Emotion Recognition Using Generative Adversarial Network and Deep Convolutional Neural Network
    Kishor Bhangale
    Mohanaprasad Kothandaraman
    [J]. Circuits, Systems, and Signal Processing, 2024, 43 : 2341 - 2384
  • [2] Emotion Recognition Based on EEG Using Generative Adversarial Nets and Convolutional Neural Network
    Pan, Bo
    Zheng, Wei
    [J]. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE, 2021, 2021
  • [3] Deep Convolutional Generative Adversarial Network and Convolutional Neural Network for Smoke Detection
    Yin, Hang
    Wei, Yurong
    Liu, Hedan
    Liu, Shuangyin
    Liu, Chuanyun
    Gao, Yacui
    [J]. COMPLEXITY, 2020, 2020
  • [4] Improvement of Speech Emotion Recognition by Deep Convolutional Neural Network and Speech Features
    Mohanty, Aniruddha
    Cherukuri, Ravindranath C.
    Prusty, Alok Ranjan
    [J]. THIRD CONGRESS ON INTELLIGENT SYSTEMS, CIS 2022, VOL 1, 2023, 608 : 117 - 129
  • [5] Speech Emotion Recognition from Spectrograms with Deep Convolutional Neural Network
    Badshah, Abdul Malik
    Ahmad, Jamil
    Rahim, Nasir
    Baik, Sung Wook
    [J]. 2017 INTERNATIONAL CONFERENCE ON PLATFORM TECHNOLOGY AND SERVICE (PLATCON), 2017, : 125 - 129
  • [6] Face Generation using Deep Convolutional Generative Adversarial Neural Network
    Devaki, P.
    Kumar, Prasanna C. B.
    Kaviraj, S.
    Ramprasath, A.
    [J]. BIOSCIENCE BIOTECHNOLOGY RESEARCH COMMUNICATIONS, 2020, 13 (11): : 20 - 23
  • [7] Speech Emotion Recognition Using Deep Convolutional Neural Network and Simple Recurrent Unit
    Jiang, Pengxu
    Fu, Hongliang
    Tao, Huawei
    [J]. ENGINEERING LETTERS, 2019, 27 (04) : 901 - 906
  • [8] Transforming the Emotion in Speech using a Generative Adversarial Network
    Yasuda, Kenji
    Orihara, Ryohei
    Sei, Yuichi
    Tahara, Yasuyuki
    Ohsuga, Akihiko
    [J]. PROCEEDINGS OF THE 11TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE (ICAART), VOL 2, 2019, : 427 - 434
  • [9] Facial Emotion Recognition Using Deep Convolutional Neural Network
    Pranav, E.
    Kamal, Suraj
    Chandran, Satheesh C.
    Supriya, M. H.
    [J]. 2020 6TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING AND COMMUNICATION SYSTEMS (ICACCS), 2020, : 317 - 320
  • [10] Image recognition of interference fringes in polishing by convolutional neural network with data augmentation by deep convolutional generative adversarial network
    Chen, Yi-Huei
    Lin, Wei-Ting
    Liu, Chun-Wei
    [J]. OPTICAL ENGINEERING, 2022, 61 (04)