Self-labeling with feature transfer for speech emotion recognition

Citations: 11
Authors
Wen, Guihua [1]
Liao, Huiqiang [1]
Li, Huihui [2]
Wen, Pengchen [3]
Zhang, Tong [1]
Gao, Sande [4]
Wang, Bao [4]
Affiliations
[1] South China Univ Technol, Sch Comp Sci & Engn, Guangzhou, Peoples R China
[2] Guangdong Polytech Normal Univ, Sch Comp Sci, Guangzhou, Peoples R China
[3] Hubei Minzu Univ, Sch Informat Engn, Enshi, Hubei, Peoples R China
[4] Affiliated TCM Hosp Guangzhou Med Univ, Guangzhou, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Speech emotion recognition; Deep neural network; Self-labeled; Speech frame; Transfer learning; REPRESENTATION;
DOI
10.1016/j.knosys.2022.109589
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Most speech emotion recognition methods based on frames have obtained good results in many applications. However, they segment each speech sample into smaller frames that are labeled with the same emotional tag as that of the speech sample. This is inconsistent with the possibility of a speech sample containing several emotional categories at the same time. Thus, this paper proposes a self-labeling (SL) learning method for speech emotion recognition, which automatically segments each speech sample into frames and then labels them with the corresponding emotional tags, where the compatibility of these tags is also checked. Then, a time-frequency deep neural network for speech emotion recognition is designed and trained. As most speech emotion datasets are very small, the feature transfer model is applied to further enhance the performance of the SL learning method, which is trained on large-scale audio data. Experimental results on various datasets demonstrate the effectiveness of the proposed method. (C) 2022 Published by Elsevier B.V.
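The frame-based baseline that the abstract criticizes can be sketched as follows. This is only an illustration of "every frame inherits the utterance label", not the authors' self-labeling method; the function name `frame_with_inherited_label` and the parameters `frame_len` and `hop` are hypothetical and do not come from the paper.

```python
from typing import List, Sequence, Tuple


def frame_with_inherited_label(
    samples: Sequence[float],
    label: str,
    frame_len: int,
    hop: int,
) -> List[Tuple[List[float], str]]:
    """Split one utterance into fixed-length frames.

    Every frame copies the utterance-level label, which is the
    simplification the abstract describes as problematic.
    """
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    frames = []
    # Only full frames are kept; a trailing partial frame is dropped.
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append((list(samples[start:start + frame_len]), label))
    return frames


# Toy utterance (assumed values) with a single utterance-level tag.
utterance = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
frames = frame_with_inherited_label(utterance, "angry", frame_len=4, hop=2)
```

All frames here carry the tag "angry" regardless of their content; a self-labeling approach would instead assign, and then check, a tag per frame.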
Pages: 10
Related papers
50 in total
  • [31] Harmony search for feature selection in speech emotion recognition
    Tao, Yongsen
    Wang, Kunxia
    Yang, Jing
    An, Ning
    Li, Lian
    2015 INTERNATIONAL CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII), 2015, : 362 - 367
  • [32] Speech Emotion Recognition based on Multiple Feature Fusion
    Jiang, Changjiang
    Mao, Rong
    Liu, Geng
    Wang, Mingyi
    2019 CHINESE AUTOMATION CONGRESS (CAC2019), 2019, : 907 - 912
  • [33] COMBINING FEATURE SELECTION AND REPRESENTATION FOR SPEECH EMOTION RECOGNITION
    Han, Wenjing
    Ruan, Huabin
    Yu, Xiaojie
    Zhu, Xuan
    2016 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW), 2016,
  • [34] Speech emotion recognition based on time domain feature
    Zhao, Lasheng
    Wei, Xiaopeng
    Zhang, Qiang
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE INFORMATION COMPUTING AND AUTOMATION, VOLS 1-3, 2008, : 1319 - 1321
  • [35] A Salient Feature Extraction Algorithm for Speech Emotion Recognition
    Liang, Ruiyu
    Tao, Huawei
    Tang, Guichen
    Wang, Qingyun
    Zhao, Li
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2015, E98D (09): : 1715 - 1718
  • [36] Survey on discriminative feature selection for speech emotion recognition
    Xu, Xin
    Li, Ya
    Xu, Xiaoying
    Wen, Zhengqi
    Che, Hao
    Liu, Shanfeng
    Tao, Jianhua
    2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2014, : 345 - +
  • [37] Combining Self-labeling with Selective Sampling
    Kozal, Jedrzej
    Wozniak, Michal
    2023 23RD IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS, ICDMW 2023, 2023, : 70 - 79
  • [38] Outcomes of self-labeling sexual harassment
    Magley, VJ
    Hulin, CL
    Fitzgerald, LF
    DeNardo, M
    JOURNAL OF APPLIED PSYCHOLOGY, 1999, 84 (03) : 390 - 402
  • [39] Significance of TEO Slope Feature in Speech Emotion Recognition
    Drisya, P. S.
    Rajan, Rajeev
    2017 INTERNATIONAL CONFERENCE ON NETWORKS & ADVANCES IN COMPUTATIONAL TECHNOLOGIES (NETACT), 2017, : 438 - 441
  • [40] Feature fusion: research on emotion recognition in English speech
    Yang Y.
    International Journal of Speech Technology, 2024, 27 (02) : 319 - 327