Long-Time Speech Emotion Recognition Using Feature Compensation and Accentuation-Based Fusion

Cited by: 1
Authors
Sun, Jiu [1]
Zhu, Jinxin [1]
Shao, Jun [1]
Affiliations
[1] Yancheng Inst Technol, Sch Informat Technol, Yancheng 224051, Peoples R China
Keywords
Speech emotion recognition; Feature compensation; Long-time emotion recognition; Accentuation-based fusion
DOI
10.1007/s00034-023-02480-6
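Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract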
This paper studies speech emotion feature optimization using stochastic optimization algorithms and feature compensation using deep neural networks, and proposes accentuation-based fusion for long-time speech emotion recognition. First, we study emotional feature extraction and construct a set of speech features for emotion recognition. Second, we propose a sample adaptation method based on a denoising autoencoder, which maps sample features to make them more general and improve adaptability. Third, a genetic algorithm (GA) and the shuffled frog leaping algorithm (SFLA) are used to optimize the feature combination and improve utterance-level emotion recognition. Finally, we use a transformer model to implement accentuation-based emotion fusion over long-time speech. Experiments use a continuous long-time speech corpus as well as the publicly available EMO-DB. Results show that the proposed method effectively improves the performance of long-time speech emotion recognition.
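The abstract mentions using a GA to optimize the combination of features. The record gives no details of the authors' encoding, operators, or fitness function, so the following is only a minimal sketch of generic GA-based feature subset selection, not the paper's method. The function `ga_select`, the toy fitness `toy_fitness`, and the set `INFORMATIVE` are hypothetical names introduced for illustration; in practice the fitness would presumably be a classifier's validation accuracy on the selected features.

```python
import random


def ga_select(n_features, fitness, pop_size=20, generations=30,
              mutation_rate=0.05, seed=0):
    """Search for a binary feature mask that maximizes `fitness`.

    Generic genetic algorithm (assumed design, not taken from the paper):
    binary tournament selection, one-point crossover, bit-flip mutation,
    and elitism (the best mask always survives to the next generation).
    Requires n_features >= 2.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)

    def pick():
        # Binary tournament: the fitter of two random individuals wins.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = [best[:]]  # elitism: carry the current best forward
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best


# Hypothetical toy fitness: reward three "informative" features and
# penalize every extra feature that is switched on.
INFORMATIVE = {0, 3, 5}


def toy_fitness(mask):
    hits = sum(mask[i] for i in INFORMATIVE)
    extras = sum(mask) - hits
    return hits - 0.5 * extras


best_mask = ga_select(8, toy_fitness)
print(best_mask, toy_fitness(best_mask))
```

Because of elitism, the best fitness never decreases from one generation to the next, which makes the search easy to monitor. SFLA could be plugged into the same mask-plus-fitness interface, but it is not sketched here.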
Pages: 916 - 940
Number of pages: 25
Related papers
50 in total
  • [41] Speech emotion recognition using a novel feature set
    Yang, J. (jsjyj0801@163.com), 1600, Binary Information Press, P.O. Box 162, Bethel, CT 06801-0162, United States (09):
  • [42] Speech Emotion Recognition using SVM with thresholding fusion
    Gupta, Shilpi
    Mehra, Anu
    Vinay
    2ND INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND INTEGRATED NETWORKS (SPIN) 2015, 2015, : 570 - 574
  • [43] Speech emotion recognition based on multi-dimensional feature extraction and multi-scale feature fusion
    Yu, Lingli
    Xu, Fengjun
    Qu, Yundong
    Zhou, Kaijun
    APPLIED ACOUSTICS, 2024, 216
  • [44] An algorithm study for speech emotion recognition based speech feature analysis
    Zhengbiao, Ji
    Feng, Zhou
    Ming, Zhu
    International Journal of Multimedia and Ubiquitous Engineering, 2015, 10 (11): : 33 - 42
  • [45] Front-End Feature Compensation for Noise Robust Speech Emotion Recognition
    Pandharipande, Meghna
    Chakraborty, Rupayan
    Panda, Ashish
    Das, Biswajit
    Kopparapu, Sunil Kumar
    2019 27TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2019,
  • [46] Combined CNN LSTM with attention for speech emotion recognition based on feature-level fusion
    Liu Y.
    Chen A.
    Zhou G.
    Yi J.
    Xiang J.
    Wang Y.
    Multimedia Tools and Applications, 2024, 83 (21) : 59839 - 59859
  • [47] DOMAIN-ADVERSARIAL AUTOENCODER WITH ATTENTION BASED FEATURE LEVEL FUSION FOR SPEECH EMOTION RECOGNITION
    Gao, Yuan
    Liu, JiaXing
    Wang, Longbiao
    Dang, Jianwu
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6314 - 6318
  • [48] ANN based Decision Fusion for Speech Emotion Recognition
    Xu, Lu
    Xu, Mingxing
    Yang, Dali
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 2003 - +
  • [49] MFCC-based feature extraction model for long time period emotion speech using CNN
    Alhlffee M.
    Revue d'Intelligence Artificielle, 2020, 34 (02): : 117 - 123
  • [50] Feature compensation based on the normalization of vocal tract length for the improvement of emotion-affected speech recognition
    Masoud Geravanchizadeh
    Elnaz Forouhandeh
    Meysam Bashirpour
    EURASIP Journal on Audio, Speech, and Music Processing, 2021