Long-Time Speech Emotion Recognition Using Feature Compensation and Accentuation-Based Fusion

Cited by: 1
Authors
Sun, Jiu [1]
Zhu, Jinxin [1]
Shao, Jun [1]
Affiliations
[1] Yancheng Institute of Technology, School of Information Technology, Yancheng 224051, People's Republic of China
Keywords
Speech emotion recognition; Feature compensation; Long-time emotion recognition; Accentuation-based fusion
DOI
10.1007/s00034-023-02480-6
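Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809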
Abstract
In this paper, we study speech emotion feature optimization using stochastic optimization algorithms and feature compensation using deep neural networks. We also propose accentuation-based fusion for long-time speech emotion recognition. First, we study the extraction of emotional features and construct a set of speech features for emotion recognition. Second, we propose a sample adaptation method based on a denoising autoencoder, which maps sample features to improve their generality and adaptive ability. Third, a genetic algorithm (GA) and the shuffled frog leaping algorithm (SFLA) are used to optimize the feature combination and improve utterance-level emotion recognition. Finally, we use a transformer model to implement accentuation-based emotion fusion in long-time speech. Experiments are conducted on a continuous long-time speech corpus and the publicly available EMO-DB. The results show that the proposed method can effectively improve the performance of long-time speech emotion recognition.
Pages: 916 - 940
Number of pages: 25
Related papers
50 in total
  • [1] Long-Time Speech Emotion Recognition Using Feature Compensation and Accentuation-Based Fusion
    Jiu Sun
    Jinxin Zhu
    Jun Shao
    Circuits, Systems, and Signal Processing, 2024, 43 : 916 - 940
  • [2] Speech Emotion Recognition Based on Feature Fusion
    Shen, Qi
    Chen, Guanggen
    Chang, Lin
    PROCEEDINGS OF THE 2017 2ND INTERNATIONAL CONFERENCE ON MATERIALS SCIENCE, MACHINERY AND ENERGY ENGINEERING (MSMEE 2017), 2017, 123 : 1071 - 1074
  • [3] Speech Emotion Recognition based on Multiple Feature Fusion
    Jiang, Changjiang
    Mao, Rong
    Liu, Geng
    Wang, Mingyi
    2019 CHINESE AUTOMATION CONGRESS (CAC2019), 2019, : 907 - 912
  • [4] Feature Fusion of Speech Emotion Recognition Based on Deep Learning
    Liu, Gang
    He, Wei
    Jin, Bicheng
    PROCEEDINGS OF 2018 INTERNATIONAL CONFERENCE ON NETWORK INFRASTRUCTURE AND DIGITAL CONTENT (IEEE IC-NIDC), 2018, : 193 - 197
  • [5] Speech Emotion Recognition Based on Multi Acoustic Feature Fusion
    Xiang, Shanshan
    Anwer, Sadiyagul
    Yilahun, Hankiz
    Hamdulla, Askar
    MAN-MACHINE SPEECH COMMUNICATION, NCMMSC 2024, 2025, 2312 : 338 - 346
  • [6] Speech emotion recognition based on multimodal and multiscale feature fusion
    Hu, Huangshui
    Wei, Jie
    Sun, Hongyu
    Wang, Chuhang
    Tao, Shuo
    SIGNAL IMAGE AND VIDEO PROCESSING, 2025, 19 (01)
  • [7] Speech emotion recognition based on time domain feature
    Zhao, Lasheng
    Wei, Xiaopeng
    Zhang, Qiang
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE INFORMATION COMPUTING AND AUTOMATION, VOLS 1-3, 2008, : 1319 - 1321
  • [8] An autoencoder-based feature level fusion for speech emotion recognition
    Peng Shixin
    Chen Kai
    Tian Tian
    Chen Jingying
    Digital Communications and Networks, 2024, 10 (05) : 1341 - 1351
  • [9] Speech emotion recognition based on multi‐feature and multi‐lingual fusion
    Chunyi Wang
    Ying Ren
    Na Zhang
    Fuwei Cui
    Shiying Luo
    Multimedia Tools and Applications, 2022, 81 : 4897 - 4907
  • [10] Multi-feature Fusion Speech Emotion Recognition Based on SVM
    Zeng, Xiaoping
    Dong, Li
    Chen, Guanghui
    Dong, Qi
    PROCEEDINGS OF 2020 IEEE 10TH INTERNATIONAL CONFERENCE ON ELECTRONICS INFORMATION AND EMERGENCY COMMUNICATION (ICEIEC 2020), 2020, : 77 - 80