Speech emotion recognition based on multi-feature speed rate and LSTM

Cited by: 4
Authors
Yang, Zijun [1]
Li, Zhen [1]
Zhou, Shi [2]
Zhang, Lifeng [1]
Serikawa, Seiichi [1]
Affiliations
[1] Kyushu Inst Technol, 1-1 Sensuicho, Tobata Ward, Kitakyushu, Fukuoka 8040011, Japan
[2] Huzhou Univ, 759, East 2nd Rd, Huzhou 313000, Zhejiang, Peoples R China
Keywords
Speech emotion recognition; LSTM; Voiced sound; Phonogram; Short-time features; DEPRESSION; SEVERITY; SIGNALS
DOI
10.1016/j.neucom.2024.128177
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Correctly recognizing speech emotions is important in various fields, such as healthcare and human-computer interaction (HCI). However, the complexity of speech signal features makes speech emotion recognition challenging. This study introduces a novel multi-feature method for speech emotion recognition that combines short-time and rhythmic features. Using short-time energy, zero-crossing rate, and average amplitude difference, the proposed approach reduces feature dimensionality and thereby mitigates overfitting. With a long short-term memory (LSTM) network, the experiments achieved high accuracy across diverse datasets. Specifically, the proposed method achieved accuracies of up to 98.47% on the CASIA dataset, 100% on the Emo-DB dataset, and 98.87% on the EMOVO dataset, demonstrating that it can discern speaker emotions across different languages and emotion classes. These findings underscore the importance of incorporating speech rate when recognizing emotional content, and the method holds promise for application in HCI and auxiliary medical diagnostics.
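The three short-time features named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the frame length, hop size, and lag are assumed values, the helper names are hypothetical, and "average amplitude difference" is assumed here to mean the average magnitude difference function (AMDF) evaluated at a single lag.

```python
import math


def sine_tone(freq_hz, sample_rate, n_samples):
    """Synthetic test signal: a unit-amplitude sine wave."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n_samples)]


def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[start:start + frame_len]
            for start in range(0, len(samples) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Sum of squared sample values in the frame."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (zero counts as non-negative)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def average_magnitude_difference(frame, lag):
    """Mean of |x[i] - x[i + lag]| over the frame (AMDF at one lag, an assumed reading)."""
    if not 0 < lag < len(frame):
        raise ValueError("lag must satisfy 0 < lag < len(frame)")
    n = len(frame) - lag
    return sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n


def frame_features(samples, frame_len=400, hop=160, lag=80):
    """Return one (energy, zcr, amdf) tuple per frame; the defaults are assumptions."""
    return [(short_time_energy(f),
             zero_crossing_rate(f),
             average_magnitude_difference(f, lag))
            for f in frame_signal(samples, frame_len, hop)]


# Hypothetical usage: 0.1 s of a 100 Hz tone sampled at 16 kHz.
features = frame_features(sine_tone(100, 16000, 1600))
```

Each frame yields only three numbers, which is one way to read the abstract's point about low feature dimensionality. The resulting per-frame sequence is the kind of input a recurrent model such as an LSTM could consume, although the abstract does not specify how the features are arranged or combined with speech rate.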
Pages: 12