Hybrid Multi-Task Learning for End-To-End Multimodal Emotion Recognition

Cited by: 0
Authors
Chen, Junjie [1]
Li, Yongwei [2]
Zhao, Ziping [1]
Liu, Xuefei [2]
Wen, Zhengqi [2]
Tao, Jianhua [3, 4]
Affiliations
[1] Tianjin Normal Univ, Coll Comp & Informat Engn, Tianjin, Peoples R China
[2] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing, Peoples R China
[3] Tsinghua Univ, Dept Automat, Beijing, Peoples R China
[4] Tsinghua Univ, Beijing Natl Res Ctr Informat Sci & Technol, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
DOI
10.1109/APSIPAASC58517.2023.10317160
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multimodal emotion recognition plays a pivotal role in the advancement of natural human-computer interaction systems. Recent studies have attempted to apply multi-task learning to emotion recognition. However, in traditional methods the shared multi-task feature extractor must integrate the feature representations of all tasks, which may prevent it from focusing on learning emotion representations. To address this problem, we propose a hybrid multi-task learning framework for end-to-end multimodal emotion recognition, in which the primary task is emotion classification and the auxiliary tasks are emotion regression and gender classification. The framework consists of two networks, one specialized in gender recognition and one in emotion recognition, where the latter transfers knowledge from the former through our proposed Deep Aggregation LSTM (DA-LSTM). By aggregating the emotion and gender feature extractors, the DA-LSTM can capture emotional information in discourse more precisely. Experimental results on the commonly used IEMOCAP dataset demonstrate the effectiveness of the proposed method.
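As a rough illustration of the task setup described in the abstract (one primary classification task plus two auxiliary tasks), the sketch below combines three per-task losses into a single training objective. The abstract does not state how the losses are combined, so the weighted sum, the weights `w_regression` and `w_gender`, the choice of cross-entropy and mean squared error, and all function names are assumptions made for illustration only. It does not model the DA-LSTM itself.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Cross-entropy loss for one example with an integer class label."""
    return -math.log(softmax(logits)[target_index])


def mean_squared_error(predicted, target):
    """Mean squared error between two equal-length lists of floats."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def weighted_multitask_loss(emotion_logits, emotion_label,
                            regression_pred, regression_target,
                            gender_logits, gender_label,
                            w_regression=0.5, w_gender=0.5):
    """Hypothetical objective: primary loss plus weighted auxiliary losses.

    The weights are illustrative assumptions, not values from the paper.
    """
    primary = cross_entropy(emotion_logits, emotion_label)        # emotion classification
    aux_regression = mean_squared_error(regression_pred,
                                        regression_target)        # emotion regression
    aux_gender = cross_entropy(gender_logits, gender_label)       # gender classification
    return primary + w_regression * aux_regression + w_gender * aux_gender


if __name__ == "__main__":
    loss = weighted_multitask_loss(
        emotion_logits=[2.0, 0.1, -1.0, 0.3], emotion_label=0,
        regression_pred=[0.6, 0.4, 0.5], regression_target=[0.7, 0.5, 0.5],
        gender_logits=[0.2, 1.5], gender_label=1,
    )
    print(f"combined loss: {loss:.4f}")
```

Keeping the primary loss unweighted and scaling only the auxiliary terms is one common convention for expressing that the auxiliary tasks support, rather than compete with, the primary task; other weighting schemes are equally plausible here.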
Pages: 1966-1971
Number of pages: 6