Implicitly Aligning Joint Distributions for Cross-Corpus Speech Emotion Recognition

Authors
Lu, Cheng [1, 2]
Zong, Yuan [1, 3]
Tang, Chuangao [1, 3]
Lian, Hailun [1, 2]
Chang, Hongli [1, 2]
Zhu, Jie [1, 2]
Li, Sunan [1, 2]
Zhao, Yan [1, 2]
Affiliations
[1] Southeast Univ, Minist Educ, Key Lab Child Dev & Learning Sci, Nanjing 210096, Peoples R China
[2] Southeast Univ, Sch Informat Sci & Engn, Nanjing 210096, Peoples R China
[3] Southeast Univ, Sch Biol Sci & Med Engn, Nanjing 210096, Peoples R China
Keywords
cross-corpus speech emotion recognition; domain adaptation; transfer subspace learning; marginal distribution; conditional distribution; KERNEL; ADAPTATION; NETWORK
DOI
10.3390/electronics11172745
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In this paper, we investigate cross-corpus speech emotion recognition (SER), in which the training (source) and testing (target) speech samples come from different corpora. This setting produces a feature distribution mismatch between the source and target samples, which causes the performance of most existing SER methods to drop sharply. To solve this problem, we propose a simple yet effective transfer subspace learning method called joint distribution implicitly aligned subspace learning (JIASL). The basic idea of JIASL is straightforward: build an emotion-discriminative and corpus-invariant linear regression model under an implicit distribution alignment strategy. Following this idea, we first use the source speech features and emotion labels to give the regression model emotion-discriminative ability. Then, a reconstruction regularization term that jointly considers the marginal and conditional distribution alignments between the speech samples of the two corpora is adopted to implicitly enable the regression model to predict the emotion labels of target speech samples. To evaluate JIASL, extensive cross-corpus SER experiments are carried out, and the results demonstrate its promising performance on cross-corpus SER tasks.
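The general pattern the abstract describes (a linear regression trained on labeled source samples, plus a regularizer that pulls the two corpora together) can be illustrated with a small sketch. This is not the JIASL objective, which the abstract does not state. As a stand-in for the paper's reconstruction regularizer, the sketch uses a simple mean-matching penalty that covers marginal alignment only; the function names, the hyperparameters `lam` and `mu`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def fit_aligned_regression(X_src, y_src, X_tgt, n_classes, lam=1.0, mu=10.0):
    """Illustrative stand-in, NOT the paper's JIASL objective.

    Minimizes  ||Y - X_src W||_F^2 + lam ||W||_F^2 + mu ||d^T W||^2,
    where Y is the one-hot source label matrix and d is the difference
    between the source and target feature means (marginal alignment only).
    """
    Y = np.eye(n_classes)[y_src]                            # one-hot labels
    d = (X_src.mean(axis=0) - X_tgt.mean(axis=0))[:, None]  # mean gap, shape (features, 1)
    n_features = X_src.shape[1]
    # Setting the gradient to zero gives a linear system in W.
    A = X_src.T @ X_src + lam * np.eye(n_features) + mu * (d @ d.T)
    return np.linalg.solve(A, X_src.T @ Y)

def predict(W, X):
    """Assign each sample the class with the largest regression output."""
    return np.argmax(X @ W, axis=1)

# Synthetic data standing in for two corpora (hypothetical, for illustration).
rng = np.random.default_rng(0)
X_src = rng.normal(size=(200, 5))
y_src = rng.integers(0, 3, size=200)
X_tgt = rng.normal(size=(150, 5)) + 0.5                     # shifted "target corpus"

W = fit_aligned_regression(X_src, y_src, X_tgt, n_classes=3)
pred_tgt = predict(W, X_tgt)
```

Raising `mu` shrinks the gap between the projected source and target means at the cost of a looser fit to the source labels. Handling the conditional (per-class) distributions, as the abstract describes, would need class-wise terms that this sketch leaves out.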
Pages: 13
Related papers
50 in total
  • [1] A CROSS-CORPUS STUDY ON SPEECH EMOTION RECOGNITION
    Milner, Rosanna
    Jalal, Md Asif
    Ng, Raymond W. M.
    Hain, Thomas
    [J]. 2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 304 - 311
  • [2] CROSS-CORPUS SPEECH EMOTION RECOGNITION USING JOINT DISTRIBUTION ADAPTIVE REGRESSION
    Zhang, Jiacheng
    Jiang, Lin
    Zong, Yuan
    Zheng, Wenming
    Zhao, Li
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 3790 - 3794
  • [3] Cross-Corpus Speech Emotion Recognition Based on Joint Transfer Subspace Learning and Regression
    Zhang, Weijian
    Song, Peng
    Chen, Dongliang
    Sheng, Chao
    Zhang, Wenjing
    [J]. IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2022, 14 (02) : 588 - 598
  • [4] A STUDY ON CROSS-CORPUS SPEECH EMOTION RECOGNITION AND DATA AUGMENTATION
    Braunschweiler, Norbert
    Doddipatla, Rama
    Keizer, Simon
    Stoyanchev, Svetlana
    [J]. 2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 24 - 30
  • [5] Cross-Corpus Speech Emotion Recognition Based on Causal Emotion Information Representation
    Fu, Hongliang
    Li, Qianqian
    Tao, Huawei
    Zhu, Chunhua
    Xie, Yue
    Guo, Ruxue
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2024, E107D (08) : 1097 - 1100
  • [6] Synthesized speech for model training in cross-corpus recognition of human emotion
    Schuller, Bjorn
    Zhang, Zixing
    Weninger, Felix
    Burkhardt, Felix
    [J]. INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2012, 15 (03) : 313 - 323
  • [7] Cross-Corpus Speech Emotion Recognition Based on Hybrid Neural Networks
    Rehman, Abdul
    Liu, Zhen-Tao
    Li, Dan-Yun
    Wu, Bao-Han
    [J]. PROCEEDINGS OF THE 39TH CHINESE CONTROL CONFERENCE, 2020, : 7464 - 7468
  • [8] DOMAIN GENERALIZATION WITH TRIPLET NETWORK FOR CROSS-CORPUS SPEECH EMOTION RECOGNITION
    Lee, Shi-wook
    [J]. 2021 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP (SLT), 2021, : 389 - 396
  • [9] A Cross-Corpus Recognition of Emotional Speech
    Xiao, Zhongzhe
    Wu, Di
    Zhang, Xiaojun
    Tao, Zhi
    [J]. PROCEEDINGS OF 2016 9TH INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DESIGN (ISCID), VOL 2, 2016, : 42 - 46
  • [10] Deep Cross-Corpus Speech Emotion Recognition: Recent Advances and Perspectives
    Zhang, Shiqing
    Liu, Ruixin
    Tao, Xin
    Zhao, Xiaoming
    [J]. FRONTIERS IN NEUROROBOTICS, 2021, 15