Deep Transductive Transfer Regression Network for Cross-Corpus Speech Emotion Recognition

Cited by: 3

Authors:

Zhao, Yan [1,2]

Wang, Jincen [3]

Ye, Ru [4]

Zong, Yuan [1]

Zheng, Wenming [1]

Zhao, Li [2]

Affiliations:

[1] Southeast Univ, Minist Educ, Key Lab Child Dev & Learning Sci, Nanjing, Peoples R China

[2] Southeast Univ, Sch Informat Sci & Engn, Nanjing 210096, Peoples R China

[3] Nanjing Univ Informat Sci & Technol, Sch Elect & Informat Engn, Nanjing, Peoples R China

[4] Nanjing Univ Informat Sci & Technol, Changwang Sch Honors, Nanjing, Peoples R China

Source:

INTERSPEECH 2022 | 2022

Keywords:

Cross-corpus speech emotion recognition; speech emotion recognition; domain adaptation; transfer learning; deep learning; FEATURES; KERNEL;

DOI:

10.21437/Interspeech.2022-679

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

This paper addresses cross-corpus speech emotion recognition (SER), in which the training (source) and testing (target) speech samples come from different corpora, leading to a gap between their feature distributions. To solve this problem, we propose a simple yet effective method called the deep transductive transfer regression network (DTTRN). The basic idea of DTTRN is to learn a corpus-invariant deep neural network that bridges the source and target speech samples and their label information. Following this idea, we use transductive learning to make a deep regressor model the relationship between features and emotion labels jointly over both speech corpora. We also design an emotion-guided regularization term for learning DTTRN, which aligns the feature distributions of the source and target speech samples at three different scales. As a result, DTTRN, which uses only the label information provided by the source speech samples, is able to correctly predict the emotions of the target samples. To evaluate DTTRN, we conduct extensive cross-corpus SER experiments on the EmoDB, CASIA, and eNTERFACE corpora. The experimental results show that DTTRN outperforms recent state-of-the-art deep transfer learning methods on cross-corpus SER tasks.
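
The general shape of such an objective, a regression loss on labelled source samples plus a term that pulls source and target feature distributions together, can be sketched as follows. This is only an illustration under stated assumptions, not the authors' formulation: the abstract does not give the loss, so the distance between feature means (a linear-kernel mean discrepancy) stands in for the alignment term, only one scale is shown instead of three, and the names `mean_discrepancy`, `regression_loss`, `transfer_objective`, and the weight `lam` are hypothetical.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def mean_discrepancy(source_feats, target_feats):
    """Squared Euclidean distance between source and target feature means.

    Assumption: used here as a simple stand-in for a distribution
    alignment term; the paper's actual regularizer may differ.
    """
    mu_s = mean_vector(source_feats)
    mu_t = mean_vector(target_feats)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


def regression_loss(preds, one_hot_labels):
    """Squared error between predicted scores and one-hot emotion labels,
    summed over classes and averaged over samples."""
    total = 0.0
    for p, y in zip(preds, one_hot_labels):
        total += sum((pi - yi) ** 2 for pi, yi in zip(p, y))
    return total / len(preds)


def transfer_objective(src_preds, src_labels, src_feats, tgt_feats, lam=1.0):
    """Source-only regression loss plus a weighted alignment penalty.

    Target labels are never used; target samples enter only through
    their features, which is what makes the setup transductive.
    """
    return (regression_loss(src_preds, src_labels)
            + lam * mean_discrepancy(src_feats, tgt_feats))
```

In this sketch the target corpus contributes only unlabelled features, so lowering the objective fits the source labels while reducing the mismatch between the two corpora's feature means.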

Pages: 371-375

Page count: 5
