GAN-based Data Generation for Speech Emotion Recognition

Cited by: 14
Authors
Eskimez, Sefik Emre [1]
Dimitriadis, Dimitrios [1]
Gmyr, Robert [1]
Kumatani, Kenichi [1]
Affiliations
[1] Microsoft, One Microsoft Way, Redmond, WA 98052 USA
Source
Keywords
speech emotion recognition; generative adversarial networks; data augmentation
DOI
10.21437/Interspeech.2020-2898
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
In this work, we propose a GAN-based method to generate synthetic data for speech emotion recognition. Specifically, we investigate the usage of GANs for capturing the data manifold when the data is eyes-off, i.e., where we can train networks using the data but cannot copy it from the clients. We propose a CNN-based GAN with spectral normalization on both the generator and discriminator, both of which are pre-trained on large unlabeled speech corpora. We show that our method provides better speech emotion recognition performance than a strong baseline. Furthermore, we show that even after the data on the client is lost, our model can generate similar data that can be used for model bootstrapping in the future. Although we evaluated our method for speech emotion recognition, it can be applied to other tasks.
Pages: 3446-3450
Number of pages: 5