Cross-Corpus Speech Emotion Recognition Based on Causal Emotion Information Representation

被引:0
|
作者
Fu, Hongliang [1 ]
Li, Qianqian [1 ]
Tao, Huawei [1 ]
Zhu, Chunhua [1 ]
Xie, Yue [2 ]
Guo, Ruxue [3 ]
机构
[1] Henan Univ Technol, Key Lab Grain Informat Proc & Control, Minist Educ, Zhengzhou 450001, Peoples R China
[2] Nanjing Inst Technol, Sch Commun Engn, Nanjing 211167, Peoples R China
[3] IFLYTEK Res, Hefei 230088, Peoples R China
基金
中国国家自然科学基金;
关键词
cross-corpus speech emotion recognition; causal representation learning; domain adaptation;
D O I
10.1587/transinf.2023EDL8087
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Speech emotion recognition (SER) is a key research technology to realize the third generation of artificial intelligence, which is widely used in human-computer interaction, emotion diagnosis, interpersonal communication and other fields. However, the aliasing of language and semantic information in speech tends to distort the alignment of emotion features, which affects the performance of cross-corpus SER system. This paper proposes a cross-corpus SER model based on causal emotion information representation (CEIR). The model uses the reconstruction loss of the deep autoencoder network and the source domain label information to realize the preliminary separation of causal features. Then, the causal correlation matrix is constructed, and the local maximum mean difference (LMMD) feature alignment technology is combined to make the causal features of different dimensions jointly distributed independent. Finally, the supervised fine-tuning of labeled data is used to achieve effective extraction of causal emotion information. The experimental results show that the average unweighted average recall (UAR) of the proposed algorithm is increased by 3.4% to 7.01% compared with the latest partial algorithms in the field.
引用
收藏
页码:1097 / 1100
页数:4
相关论文
共 50 条
  • [1] A CROSS-CORPUS STUDY ON SPEECH EMOTION RECOGNITION
    Milner, Rosanna
    Jalal, Md Asif
    Ng, Raymond W. M.
    Hain, Thomas
    [J]. 2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 304 - 311
  • [2] Cross-Corpus Speech Emotion Recognition Based on Hybrid Neural Networks
    Rehman, Abdul
    Liu, Zhen-Tao
    Li, Dan-Yun
    Wu, Bao-Han
    [J]. PROCEEDINGS OF THE 39TH CHINESE CONTROL CONFERENCE, 2020, : 7464 - 7468
  • [3] Cross-Corpus Speech Emotion Recognition Based on Sparse Subspace Transfer Learning
    Zhao, Keke
    Song, Peng
    Zhang, Wenjing
    Zhang, Weijian
    Li, Shaokai
    Chen, Dongliang
    Zheng, Wenming
    [J]. BIOMETRIC RECOGNITION (CCBR 2021), 2021, 12878 : 466 - 473
  • [4] A STUDY ON CROSS-CORPUS SPEECH EMOTION RECOGNITION AND DATA AUGMENTATION
    Braunschweiler, Norbert
    Doddipatla, Rama
    Keizer, Simon
    Stoyanchev, Svetlana
    [J]. 2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 24 - 30
  • [5] Few Shot Learning Guided by Emotion Distance for Cross-corpus Speech Emotion Recognition
    Yue, Pengcheng
    Wu, Yanfeng
    Qu, Leyuan
    Zheng, Shukai
    Zhao, Shuyuan
    Li, Taihao
    [J]. 2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC, 2023, : 1008 - 1012
  • [6] Auditory attention model based on Chirplet for cross-corpus speech emotion recognition
    [J]. Zhao, Li (zhaoli@seu.edu.cn), 1600, Southeast University (32):
  • [7] CROSS-CORPUS EEG-BASED EMOTION RECOGNITION
    Rayatdoost, Soheil
    Soleymani, Mohammad
    [J]. 2018 IEEE 28TH INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP), 2018,
  • [8] Synthesized speech for model training in cross-corpus recognition of human emotion
    Schuller, Bjorn
    Zhang, Zixing
    Weninger, Felix
    Burkhardt, Felix
    [J]. INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2012, 15 (03) : 313 - 323
  • [9] Implicitly Aligning Joint Distributions for Cross-Corpus Speech Emotion Recognition
    Lu, Cheng
    Zong, Yuan
    Tang, Chuangao
    Lian, Hailun
    Chang, Hongli
    Zhu, Jie
    Li, Sunan
    Zhao, Yan
    [J]. ELECTRONICS, 2022, 11 (17)
  • [10] DOMAIN GENERALIZATION WITH TRIPLET NETWORK FOR CROSS-CORPUS SPEECH EMOTION RECOGNITION
    Lee, Shi-wook
    [J]. 2021 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP (SLT), 2021, : 389 - 396