Model Smoothing using Virtual Adversarial Training for Speech Emotion Estimation using Spontaneity

Cited by: 0
Authors
Kuwahara, Toyoaki [1]
Orihara, Ryohei [1]
Sei, Yuichi [1]
Tahara, Yasuyuki [1]
Ohsuga, Akihiko [1]
Affiliations
[1] Univ Electrocommun, Grad Sch Informat & Engn, Tokyo, Japan
Keywords
Deep Learning; Cross Corpus; Virtual Adversarial Training; Emotion Recognition; Speech Processing; Spontaneity; DEEP NEURAL-NETWORK; PERCEPTION;
DOI
10.5220/0008958405700577
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The accuracy of speech-based emotion estimation has increased with the development of deep learning. However, most deep-learning approaches to emotion estimation require supervised learning, and large training datasets are difficult to obtain. In addition, accuracy drops when the environment of the training data differs significantly from that of the actual data. To address these problems, this study proposes an emotion estimation model that uses virtual adversarial training (VAT), a semi-supervised learning method that improves model robustness. Furthermore, research on the spontaneity of speech has progressed year by year, and recent studies have shown that emotion classification accuracy improves when spontaneity is taken into account. We investigate the effect of spontaneity in a cross-language situation. First, the VAT hyperparameters were set in a preliminary experiment using a single corpus. Next, the robustness of the resulting model was shown in a cross-corpus evaluation experiment. Finally, we evaluated the accuracy of emotion estimation with spontaneity taken into account and showed that considering spontaneity improves the accuracy of the model using VAT.
Pages: 570-577
Number of pages: 8
Related papers
50 in total
  • [1] SMOOTHING MODEL PREDICTIONS USING ADVERSARIAL TRAINING PROCEDURES FOR SPEECH BASED EMOTION RECOGNITION
    Sahu, Saurabh
    Gupta, Rahul
    Sivaraman, Ganesh
    Espy-Wilson, Carol
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 4934 - 4938
  • [2] Transforming the Emotion in Speech using a Generative Adversarial Network
    Yasuda, Kenji
    Orihara, Ryohei
    Sei, Yuichi
    Tahara, Yasuyuki
    Ohsuga, Akihiko
    PROCEEDINGS OF THE 11TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE (ICAART), VOL 2, 2019, : 427 - 434
  • [3] Regression estimation model for emotion and intensity of speech using perception rating
    Kawase, Megumi
    Nakayama, Minoru
    2022 26TH INTERNATIONAL CONFERENCE INFORMATION VISUALISATION (IV), 2022, : 173 - 179
  • [4] Regression estimation model for emotion and intensity of speech using perception rating
    Kawase, Megumi
    Nakayama, Minoru
    Proceedings of the International Conference on Information Visualisation, 2022, 2022-July : 173 - 179
  • [5] On Enhancing Speech Emotion Recognition using Generative Adversarial Networks
    Sahu, Saurabh
    Gupta, Rahul
    Espy-Wilson, Carol
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3693 - 3697
  • [6] ADVERSARIAL TRAINING OF END-TO-END SPEECH RECOGNITION USING A CRITICIZING LANGUAGE MODEL
    Liu, Alexander H.
    Lee, Hung-yi
    Lee, Lin-shan
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6176 - 6180
  • [7] An Emotion Estimation from Human Speech Using Speech Recognition and Speech Synthesize
    Kurematsu, Masaki
    Ohashi, Marina
    Kinosita, Orimi
    Hakura, Jun
    Fujita, Hamido
    NEW TRENDS IN SOFTWARE METHODOLOGIES, TOOLS AND TECHNIQUES, 2008, 182 : 278 - 289
  • [8] Noise Adaptive Speech Enhancement using Domain Adversarial Training
    Liao, Chien-Feng
    Tsao, Yu
    Lee, Hung-Yi
    Wang, Hsin-Min
    INTERSPEECH 2019, 2019, : 3148 - 3152
  • [9] Speech Emotion Recognition in the Wild using Multi-task and Adversarial Learning
    Parry, Jack
    DeMattos, Eric
    Klementiev, Anita
    Ind, Axel
    Morse-Kopp, Daniela
    Clarke, Georgia
    Palaz, Dimitri
    INTERSPEECH 2022, 2022, : 1158 - 1162
  • [10] An Adversarial Training Based Speech Emotion Classifier With Isolated Gaussian Regularization
    Fu, Changzeng
    Liu, Chaoran
    Ishi, Carlos Toshinori
    Ishiguro, Hiroshi
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2023, 14 (03) : 2361 - 2374