Nonnegative Matrix Factorization Based Transfer Subspace Learning for Cross-Corpus Speech Emotion Recognition

Cited by: 21
Authors
Luo, Hui [1]
Han, Jiqing [2]
Affiliations
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Comp Sci & Technol, Harbin 150001, Heilongjiang, Peoples R China
[2] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin 150001, Heilongjiang, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
Non-negative matrix factorization; transfer subspace learning; cross-corpus; speech emotion recognition; ALGORITHMS;
DOI
10.1109/TASLP.2020.3006331
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206; 082403
Abstract
This article focuses on the cross-corpus speech emotion recognition (SER) task. To overcome the problem that the distribution of training (source) samples is inconsistent with that of testing (target) samples, we propose a non-negative matrix factorization based transfer subspace learning method (NMFTSL). Our method tries to find a shared feature subspace for the source and target corpora, in which the discrepancy between the two distributions is eliminated as much as possible and their individual components are excluded, so that the knowledge of the source corpus can be transferred to the target corpus. Specifically, in this induced subspace, we minimize the distances not only between the marginal distributions but also between the conditional distributions, where both distances are measured by the maximum mean discrepancy criterion. To estimate the conditional distribution of the target corpus, we propose to integrate the prediction of target labels and the learning of the feature representation into a joint learning model. Meanwhile, we introduce a difference loss to exclude the individual components from the shared subspace, which can further reduce the mutual interference between the source and target individual components. Moreover, we propose a discrimination loss to introduce the labels into the shared subspace, which can improve the discrimination ability of the feature representation. We also provide the solution for the corresponding optimization problem. To evaluate the performance of our method, we construct 30 cross-corpus SER schemes using 6 popular speech emotion corpora. Experimental results show that our approach achieves better overall performance than state-of-the-art methods.
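The abstract names the maximum mean discrepancy (MMD) criterion but does not spell out its form. As a rough illustration only, the sketch below uses the common empirical estimate with a linear kernel, where MMD reduces to the squared distance between mean feature vectors. This is an assumption, not the paper's exact formulation: the function names, the toy features, and the emotion labels are all hypothetical, and the target labels stand in for predicted (pseudo) labels.

```python
from typing import List, Sequence

Vector = Sequence[float]


def mean_vector(rows: Sequence[Vector]) -> List[float]:
    """Column-wise mean of a non-empty list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def marginal_mmd(source: Sequence[Vector], target: Sequence[Vector]) -> float:
    """Linear-kernel MMD between the two corpora, ignoring labels."""
    return squared_distance(mean_vector(source), mean_vector(target))


def conditional_mmd(
    source: Sequence[Vector],
    source_labels: Sequence[str],
    target: Sequence[Vector],
    target_pseudo_labels: Sequence[str],
) -> float:
    """Sum of per-class linear-kernel MMDs.

    Target labels are unknown in the cross-corpus setting, so predicted
    (pseudo) labels are used in their place (an assumption of this sketch).
    """
    total = 0.0
    for label in sorted(set(source_labels)):
        src = [x for x, y in zip(source, source_labels) if y == label]
        tgt = [x for x, y in zip(target, target_pseudo_labels) if y == label]
        if not tgt:
            continue  # no target sample was predicted as this class
        total += squared_distance(mean_vector(src), mean_vector(tgt))
    return total


# Hypothetical 2-D features: the target corpus is the source shifted by (1, 1).
source = [[0.0, 0.0], [2.0, 0.0], [0.0, 4.0], [2.0, 4.0]]
source_labels = ["neutral", "neutral", "angry", "angry"]
target = [[1.0, 1.0], [3.0, 1.0], [1.0, 5.0], [3.0, 5.0]]
target_pseudo_labels = ["neutral", "neutral", "angry", "angry"]

marginal = marginal_mmd(source, target)
conditional = conditional_mmd(source, source_labels, target, target_pseudo_labels)
```

In a transfer subspace method of this kind, both quantities would be computed on the projected (subspace) features rather than on the raw ones, and the pseudo labels would be refined as the representation is learned.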
Pages: 2047-2060
Number of pages: 14