Speech emotion recognition based on multi-feature speed rate and LSTM

Cited by: 4
Authors
Yang, Zijun [1]
Li, Zhen [1]
Zhou, Shi [2]
Zhang, Lifeng [1]
Serikawa, Seiichi [1]
Affiliations
[1] Kyushu Inst Technol, 1-1 Sensuicho,Tobata Ward, Kitakyushu, Fukuoka 8040011, Japan
[2] Huzhou Univ, 759,East 2nd Rd, Huzhou 313000, Zhejiang, Peoples R China
Keywords
Speech emotion recognition; LSTM; Voiced sound; Phonogram; Short-time features; DEPRESSION; SEVERITY; SIGNALS
DOI
10.1016/j.neucom.2024.128177
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Correctly recognizing speech emotions is important in fields such as healthcare and human-computer interaction (HCI). However, the complexity of speech signal features makes speech emotion recognition difficult. This study introduces a novel multi-feature method for speech emotion recognition that combines short-time and rhythmic features. By using short-time energy, zero-crossing rate, and average amplitude difference, the proposed approach reduces feature dimensionality and thereby addresses overfitting concerns. With a long short-term memory (LSTM) network, the method achieved high accuracy across several datasets: up to 98.47% on the CASIA dataset, 100% on the Emo-DB dataset, and 98.87% on the EMOVO dataset, demonstrating that it can discern speaker emotions across different languages and emotion classes. These findings underscore the significance of incorporating speech rate for emotional content recognition, which holds promise for application in HCI and auxiliary medical diagnostics.
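The three short-time features named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the frame length (400 samples), hop size (160 samples), and lag (1 sample) are assumptions chosen for illustration, all function names are hypothetical, and the record does not say how the paper frames the signal or computes its rhythmic (speech rate) features.

```python
from typing import List, Sequence, Tuple


def frame_signal(x: Sequence[float], frame_len: int = 400,
                 hop: int = 160) -> List[List[float]]:
    """Split a signal into overlapping frames; a trailing partial frame is dropped."""
    return [list(x[s:s + frame_len])
            for s in range(0, len(x) - frame_len + 1, hop)]


def short_time_energy(frame: Sequence[float]) -> float:
    """Sum of squared samples in one frame."""
    return sum(v * v for v in frame)


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ (zero counts as positive)."""
    signs = [1 if v >= 0 else -1 for v in frame]
    crossings = sum(a != b for a, b in zip(signs, signs[1:]))
    return crossings / (len(frame) - 1)


def average_amplitude_difference(frame: Sequence[float], lag: int = 1) -> float:
    """Mean absolute difference between the frame and itself shifted by `lag` samples."""
    diffs = [abs(a - b) for a, b in zip(frame[lag:], frame)]
    return sum(diffs) / len(diffs)


def short_time_features(x: Sequence[float], frame_len: int = 400, hop: int = 160,
                        lag: int = 1) -> List[Tuple[float, float, float]]:
    """Return one (energy, zero-crossing rate, amplitude difference) tuple per frame."""
    return [
        (short_time_energy(f), zero_crossing_rate(f),
         average_amplitude_difference(f, lag))
        for f in frame_signal(x, frame_len, hop)
    ]


if __name__ == "__main__":
    # Toy signal alternating between +1 and -1, so every adjacent pair crosses zero.
    toy = [1.0 if i % 2 == 0 else -1.0 for i in range(800)]
    for row in short_time_features(toy):
        print(row)
```

Each frame yields only three numbers, which is the kind of low-dimensional per-frame sequence a recurrent model such as an LSTM could consume; how the paper actually assembles its network input is not stated in this record.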
Pages: 12