Transformer-based transfer learning and multi-task learning for improving the performance of speech emotion recognition

Authors
Park, Sunchan [1 ]
Kim, Hyung Soon [1 ]
Affiliation
[1] Pusan Natl Univ, Dept Elect Engn, 2 Busandaehak Ro 63Beon Gil, Busan 46241, South Korea
Source
Keywords
Speech emotion recognition; Transformer; Transfer learning; Multi-task learning
DOI
10.7776/ASK.2021.40.5.515
Chinese Library Classification
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Preparing sufficient training data for speech emotion recognition is difficult because emotion labeling is hard. In this paper, we apply transfer learning to a transformer-based model, using large-scale speech recognition training data, to improve the performance of speech emotion recognition. In addition, we propose a method that exploits context information without decoding, through multi-task learning with speech recognition. In speech emotion recognition experiments on the IEMOCAP dataset, our model achieves a weighted accuracy of 70.6% and an unweighted accuracy of 71.6%, which shows that the proposed method is effective in improving the performance of speech emotion recognition.
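The abstract reports both a weighted and an unweighted accuracy. As a minimal sketch, assuming the convention common in speech emotion recognition work (weighted accuracy is overall accuracy across all utterances; unweighted accuracy is the mean of per-class recalls), the two metrics could be computed as follows. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def weighted_accuracy(y_true, y_pred):
    """Overall accuracy: correct predictions divided by all utterances."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls, so every emotion class counts equally."""
    total = defaultdict(int)  # utterances per true class
    hit = defaultdict(int)    # correctly recognized utterances per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            hit[t] += 1
    recalls = [hit[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical toy labels (not IEMOCAP data): 4 neutral and 2 angry utterances.
y_true = ["neutral", "neutral", "neutral", "neutral", "angry", "angry"]
y_pred = ["neutral", "neutral", "neutral", "angry", "angry", "sad"]

wa = weighted_accuracy(y_true, y_pred)    # 4 correct out of 6
ua = unweighted_accuracy(y_true, y_pred)  # mean of 3/4 and 1/2
print(f"WA={wa:.3f} UA={ua:.3f}")
```

With imbalanced classes the two numbers diverge, which is one reason results on emotion datasets are often reported with both.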
Pages: 515-522
Number of pages: 8