Enhanced Speech Emotion Recognition Using DCGAN-Based Data Augmentation

Cited by: 6
Authors
Baek, Ji-Young [1]
Lee, Seok-Pil [2]
Tsihrintzis, George A.
Affiliations
[1] Sangmyung Univ, Grad Sch, Dept Comp Sci, Seoul 03016, South Korea
[2] Sangmyung Univ, Dept Intelligent IoT, Seoul 03016, South Korea
Keywords
artificial intelligence; deep learning; DCGAN; data augmentation; speech emotion recognition; LSTM; GMM;
DOI
10.3390/electronics12183966
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Although emotional speech recognition has received increasing emphasis in research and applications, it remains challenging due to the diversity and complexity of emotions and limited datasets. To address these limitations, we propose a novel approach utilizing DCGAN to augment data from the RAVDESS and EmoDB databases. Then, we assess the efficacy of emotion recognition using mel-spectrogram data by utilizing a model that combines CNN and BiLSTM. The preliminary experimental results reveal that the suggested technique contributes to enhancing the emotional speech identification performance. The results of this study provide directions for further development in the field of emotional speech recognition and the potential for practical applications.
Pages: 11
Related papers
50 in total
  • [1] DCGAN-Based Data Augmentation for Tomato Leaf Disease Identification
    Wu, Qiufeng
    Chen, Yiping
    Meng, Jun
    [J]. IEEE ACCESS, 2020, 8 : 98716 - 98728
  • [2] Speech Emotion Recognition Using Data Augmentation
    Kapoor, Tanisha
    Ganguly, Arnaja
    Rajeswari, D.
    [J]. 2024 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATION AND APPLIED INFORMATICS, ACCAI 2024, 2024,
  • [3] Speech emotion recognition using data augmentation
    V. M. Praseetha
    P. P. Joby
    [J]. International Journal of Speech Technology, 2022, 25 : 783 - 792
  • [4] Speech emotion recognition using data augmentation
    Praseetha, V. M.
    Joby, P. P.
    [J]. INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2021, 25 (4) : 783 - 792
  • [5] DCGAN-Based Image Data Augmentation in Rawhide Stick Products' Defect Detection
    Ding, Shuhui
    Guo, Zhongyuan
    Chen, Xiaolong
    Li, Xueyi
    Ma, Fai
    [J]. ELECTRONICS, 2024, 13 (11)
  • [6] Data Augmentation using GANs for Speech Emotion Recognition
    Chatziagapi, Aggelina
    Paraskevopoulos, Georgios
    Sgouropoulos, Dimitris
    Pantazopoulos, Georgios
    Nikandrou, Malvina
    Giannakopoulos, Theodoros
    Katsamanis, Athanasios
    Potamianos, Alexandros
    Narayanan, Shrikanth
    [J]. INTERSPEECH 2019, 2019, : 171 - 175
  • [7] DCGAN-based Scheme for Radar Spectrogram Augmentation in Human Activity classification
    Mi, Ye
    Jing, Xiaojun
    Mu, Junsheng
    Li, Xinyu
    He, Yuan
    [J]. 2018 IEEE ANTENNAS AND PROPAGATION SOCIETY INTERNATIONAL SYMPOSIUM ON ANTENNAS AND PROPAGATION & USNC/URSI NATIONAL RADIO SCIENCE MEETING, 2018, : 1973 - 1974
  • [8] Strong Generalized Speech Emotion Recognition Based on Effective Data Augmentation
    Tao, Huawei
    Shan, Shuai
    Hu, Ziyi
    Zhu, Chunhua
    Ge, Hongyi
    [J]. ENTROPY, 2023, 25 (01)
  • [9] CycleGAN-based Emotion Style Transfer as Data Augmentation for Speech Emotion Recognition
    Bao, Fang
    Neumann, Michael
    Ngoc Thang Vu
    [J]. INTERSPEECH 2019, 2019, : 2828 - 2832
  • [10] Adversarial Data Augmentation Network for Speech Emotion Recognition
    Yi, Lu
    Mak, Man-Wai
    [J]. 2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 529 - 534