Speech emotion recognition based on multi-feature speed rate and LSTM

Cited by: 4
Authors
Yang, Zijun [1]
Li, Zhen [1]
Zhou, Shi [2]
Zhang, Lifeng [1]
Serikawa, Seiichi [1]
Affiliations
[1] Kyushu Inst Technol, 1-1 Sensuicho,Tobata Ward, Kitakyushu, Fukuoka 8040011, Japan
[2] Huzhou Univ, 759,East 2nd Rd, Huzhou 313000, Zhejiang, Peoples R China
Keywords
Speech emotion recognition; LSTM; Voiced sound; Phonogram; Short-time features; DEPRESSION; SEVERITY; SIGNALS;
DOI
10.1016/j.neucom.2024.128177
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Correctly recognizing speech emotions is important in fields such as healthcare and human-computer interaction (HCI). However, the complexity of speech signal features makes speech emotion recognition challenging. This study introduces a novel multi-feature method for speech emotion recognition that combines short-time and rhythmic features. Using short-time energy, zero-crossing rate, and average amplitude difference, the proposed approach addresses overfitting concerns by reducing feature dimensionality. With a long short-term memory (LSTM) network, the experiments achieved notable accuracy across diverse datasets. Specifically, the proposed method achieved an accuracy of up to 98.47% on the CASIA dataset, 100% on the Emo-DB dataset, and 98.87% on the EMOVO dataset, demonstrating its capability to accurately discern speaker emotions across different languages and emotion classes. These findings underscore the significance of incorporating speech rate for emotional content recognition, which holds promise for application in HCI and auxiliary medical diagnostics.
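The three short-time features named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the frame length (400 samples), hop size (160 samples), and the function names are hypothetical choices, and "average amplitude difference" is interpreted here as the mean absolute difference between samples at a fixed lag, which is an assumption. The speech-rate feature and the LSTM classifier are not covered.

```python
def frame_signal(samples, frame_len=400, hop=160):
    """Split a sequence of samples into overlapping frames (assumed sizes)."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames


def short_time_energy(frame):
    """Sum of squared sample values in one frame."""
    return sum(s * s for s in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / (len(frame) - 1)


def average_magnitude_difference(frame, lag=1):
    """Mean absolute difference between samples `lag` apart (assumed definition)."""
    diffs = [abs(a - b) for a, b in zip(frame[lag:], frame)]
    return sum(diffs) / len(diffs)


def short_time_features(samples, frame_len=400, hop=160, lag=1):
    """Return one (energy, zcr, amd) tuple per frame."""
    return [
        (
            short_time_energy(f),
            zero_crossing_rate(f),
            average_magnitude_difference(f, lag),
        )
        for f in frame_signal(samples, frame_len, hop)
    ]
```

Each frame yields only three numbers, which is one plausible reading of how a low-dimensional feature sequence could be fed to a recurrent model; the paper itself should be consulted for the actual feature definitions and parameters.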
Pages: 12