Model Smoothing using Virtual Adversarial Training for Speech Emotion Estimation using Spontaneity

Cited by: 0
Authors
Kuwahara, Toyoaki [1]
Orihara, Ryohei [1]
Sei, Yuichi [1]
Tahara, Yasuyuki [1]
Ohsuga, Akihiko [1]
Affiliations
[1] Univ Electrocommun, Grad Sch Informat & Engn, Tokyo, Japan
Keywords
Deep Learning; Cross Corpus; Virtual Adversarial Training; Emotion Recognition; Speech Processing; Spontaneity; DEEP NEURAL-NETWORK; PERCEPTION;
DOI
10.5220/0008958405700577
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The accuracy of speech-based emotion estimation has increased with the development of deep learning. However, most deep-learning approaches to emotion estimation require supervised learning, and large training datasets are difficult to obtain. In addition, accuracy drops when the environment of the training data differs significantly from that of the data encountered in actual use. To address these problems, this study proposes an emotion estimation model that uses virtual adversarial training (VAT), a semi-supervised learning method that improves model robustness. Furthermore, research on the spontaneity of speech has progressed year by year, and recent studies have shown that the accuracy of emotion classification improves when spontaneity is taken into account. We therefore also investigate the effect of spontaneity in a cross-language situation. First, the VAT hyperparameters were set in a preliminary experiment using a single corpus. Next, the robustness of the resulting model was demonstrated in a cross-corpus evaluation experiment. Finally, we evaluated the accuracy of emotion estimation when spontaneity is considered and showed that considering spontaneity improves the accuracy of the model using VAT.
Pages: 570-577
Number of pages: 8