DOMAIN GENERALIZATION WITH TRIPLET NETWORK FOR CROSS-CORPUS SPEECH EMOTION RECOGNITION

Cited by: 8
Author
Lee, Shi-Wook [1]
Affiliation
[1] National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan
Keywords
Speech emotion recognition; cross-corpus; domain generalization; triplet network; adversarial
DOI
10.1109/SLT48900.2021.9383534
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Domain generalization is a major challenge for cross-corpus speech emotion recognition. The performance of systems built on "seen" source corpora inevitably degrades when they are tested against "unseen" target corpora that differ in speakers, channels, and languages. We present a novel framework based on a triplet network to learn more generalized features of emotional speech that are invariant across multiple corpora. To reduce the intrinsic discrepancies between source and target corpora, an explicit feature transformation based on the triplet network is implemented as a preprocessing step. Extensive comparison experiments are carried out on three emotional speech corpora: two English corpora and one Japanese corpus. Remarkable improvements of up to 35.61% are achieved across all cross-corpus speech emotion recognition experiments, and we show that the proposed framework using the triplet network is effective for obtaining more generalized features across multiple emotional speech corpora.
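The abstract does not give the loss function, the network architecture, or the triplet sampling scheme, so the following is only a minimal sketch of the standard triplet margin loss that triplet networks commonly optimize, not the paper's implementation. The function names, the margin value, the 2-D embeddings, and the idea of pairing same-emotion samples from different corpora are all illustrative assumptions.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Hypothetical 2-D embeddings (assumed for illustration only): the anchor and
# positive share an emotion label but come from different corpora, while the
# negative carries a different emotion label.
anchor = [0.0, 0.0]    # e.g. "angry", corpus A
positive = [3.0, 4.0]  # e.g. "angry", corpus B (distance 5.0 from the anchor)
negative = [0.0, 2.0]  # e.g. "sad", corpus A (distance 2.0 from the anchor)

# The positive is farther away than the negative, so the constraint is violated.
violating = triplet_loss(anchor, positive, negative)

# Swapping the roles satisfies the constraint with room to spare, so the loss is zero.
satisfied = triplet_loss(anchor, negative, positive)
```

Under these assumptions, minimizing the loss over many triplets pulls same-emotion samples together regardless of corpus and pushes different-emotion samples apart, which is one way an embedding can become less corpus-specific.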
Pages: 389-396
Page count: 8
Source
2021 IEEE Spoken Language Technology Workshop (SLT 2021), Proceedings, 2021, pp. 389-396