DOMAIN GENERALIZATION WITH TRIPLET NETWORK FOR CROSS-CORPUS SPEECH EMOTION RECOGNITION

Cited by: 8

Authors:

Lee, Shi-wook [1]

Affiliations:

[1] National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan

Source:

2021 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP (SLT) | 2021

Keywords:

Speech emotion recognition; cross-corpus; domain generalization; triplet network; adversarial

DOI:

10.1109/SLT48900.2021.9383534

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Domain generalization is a major challenge for cross-corpus speech emotion recognition. The performance of systems built on "seen" source corpora inevitably degrades when they are tested on "unseen" target corpora that differ in speakers, channels, and languages. We present a novel framework based on a triplet network to learn more generalized features of emotional speech that are invariant across multiple corpora. To reduce the intrinsic discrepancies between source and target corpora, an explicit feature transformation based on the triplet network is implemented as a preprocessing step. Extensive comparison experiments are carried out on three emotional speech corpora: two English corpora and one Japanese corpus. Remarkable improvements of up to 35.61% are achieved across all cross-corpus speech emotion recognition settings, showing that the proposed framework using the triplet network is effective for obtaining more generalized features across multiple emotional speech corpora.
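
The abstract does not state the loss function, distance metric, margin, or triplet selection rule used in the paper. As a rough illustration of the general idea only, the sketch below uses the common margin-based triplet loss with Euclidean distance, and a hypothetical selection rule in which the positive shares the anchor's emotion but comes from a different corpus, while the negative has a different emotion. The names `triplet_loss` and `cross_corpus_triplets`, the margin of 1.0, and the toy 2-D embeddings are all assumptions made for this example.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss (assumed form, not taken from the paper).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def cross_corpus_triplets(samples):
    """Build (anchor, positive, negative) embedding triplets.

    `samples` is a list of (embedding, emotion, corpus) tuples.
    Hypothetical rule: the positive has the same emotion as the anchor
    but a different corpus; the negative has a different emotion.
    """
    triplets = []
    for a_emb, a_emo, a_cor in samples:
        for p_emb, p_emo, p_cor in samples:
            if p_emo != a_emo or p_cor == a_cor:
                continue
            for n_emb, n_emo, _ in samples:
                if n_emo != a_emo:
                    triplets.append((a_emb, p_emb, n_emb))
    return triplets


# Toy data: two "happy" utterances from different corpora, one "angry" utterance.
samples = [
    ([0.0, 0.0], "happy", "corpus_A"),
    ([3.0, 4.0], "happy", "corpus_B"),
    ([0.0, 1.0], "angry", "corpus_A"),
]
triplets = cross_corpus_triplets(samples)
losses = [triplet_loss(a, p, n) for a, p, n in triplets]
```

In this toy setup, a large loss means that same-emotion utterances from different corpora lie farther apart than utterances with different emotions. Minimizing such a loss pulls same-emotion embeddings together regardless of corpus, which is one way to encourage the corpus-invariant features the abstract describes.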

Pages: 389-396

Page count: 8
