Speech emotion recognition based on multi-feature speed rate and LSTM

Cited by: 4

Authors:

Yang, Zijun [1]

Li, Zhen [1]

Zhou, Shi [2]

Zhang, Lifeng [1]

Serikawa, Seiichi [1]

Affiliations:

[1] Kyushu Inst Technol, 1-1 Sensuicho,Tobata Ward, Kitakyushu, Fukuoka 8040011, Japan

[2] Huzhou Univ, 759,East 2nd Rd, Huzhou 313000, Zhejiang, Peoples R China

Source:

NEUROCOMPUTING | 2024 / Vol. 601

Keywords:

Speech emotion recognition; LSTM; Voiced sound; Phonogram; Short-time features; DEPRESSION; SEVERITY; SIGNALS;

DOI:

10.1016/j.neucom.2024.128177

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Correctly recognizing speech emotions is important in fields such as healthcare and human-computer interaction (HCI). However, the complexity of speech signal features makes speech emotion recognition challenging. This study introduces a multi-feature method for speech emotion recognition that combines short-time and rhythmic features. Using short-time energy, zero-crossing rate, and average amplitude difference, the proposed approach addresses overfitting by reducing feature dimensionality. With a long short-term memory (LSTM) network, the method achieved accuracies of up to 98.47% on the CASIA dataset, 100% on the Emo-DB dataset, and 98.87% on the EMOVO dataset, demonstrating that it can discern speaker emotions across different languages and emotion classes. These findings underscore the importance of incorporating speech rate when recognizing emotional content, which holds promise for applications in HCI and auxiliary medical diagnostics.
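The three short-time features named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the frame length, hop size, lag, and all function names are assumptions chosen for illustration, and "average amplitude difference" is interpreted here as the average magnitude difference function at a single lag.

```python
import math

def frame_signal(signal, frame_len, hop):
    """Split a sequence into overlapping frames; a trailing partial frame is dropped."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Sum of squared sample values in one frame."""
    return sum(x * x for x in frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def average_amplitude_difference(frame, lag):
    """Mean absolute difference between the frame and a copy shifted by `lag` samples."""
    n = len(frame) - lag
    return sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n

def extract_features(signal, frame_len=400, hop=160, lag=80):
    """Return one [energy, zcr, amd] row per frame (all parameter values are assumed)."""
    return [[short_time_energy(f),
             zero_crossing_rate(f),
             average_amplitude_difference(f, lag)]
            for f in frame_signal(signal, frame_len, hop)]

# Synthetic input: one second of a 200 Hz tone sampled at 16 kHz.
sample_rate = 16000
tone = [math.sin(2 * math.pi * 200 * t / sample_rate) for t in range(sample_rate)]
features = extract_features(tone)
print(len(features), len(features[0]))
```

Each row holds only three values per frame, which reflects the low feature dimensionality the abstract mentions. A sequence of such rows is the kind of input an LSTM could consume frame by frame, although the network configuration used in the paper is not given in this record.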


Pages: 12

50 items in total

[31] Speech emotion recognition based on time domain feature
Zhao, Lasheng
Wei, Xiaopeng
Zhang, Qiang
PROCEEDINGS OF THE INTERNATIONAL CONFERENCE INFORMATION COMPUTING AND AUTOMATION, VOLS 1-3, 2008: 1319 - 1321
[32] FRUIT RECOGNITION BASED ON MULTI-FEATURE AND MULTI-DECISION
Wang, Xiaohua
Huang, Wei
Jin, Chao
Hu, Min
Ren, Fuji
2014 IEEE 3RD INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTELLIGENCE SYSTEMS (CCIS), 2014: 113 - 117
[33] A multi-feature stock price prediction model based on multi-feature calculation, LASSO feature selection, and Ca-LSTM network
Chen, Xiao
Cao, Lei
Cao, Zhi
Zhang, Hongwei
CONNECTION SCIENCE, 2024, 36 (01)
[34] Multi-Feature based Hand-Gesture Recognition
Herath, H. M. S. P. B.
Ekanayake, M. P. B.
Godaliyadda, G. M. R. I.
Wijayakulasooriya, J. V.
2015 FIFTEENTH INTERNATIONAL CONFERENCE ON ADVANCES IN ICT FOR EMERGING REGIONS (ICTER), 2015: 63 - 68
[35] An algorithm study for speech emotion recognition based speech feature analysis
Zhengbiao, Ji
Feng, Zhou
Ming, Zhu
International Journal of Multimedia and Ubiquitous Engineering, 2015, 10 (11): 33 - 42
[36] Multi-Feature Extraction of Pulmonary Nodules Based on LSTM and Attention Structure
Ni Y.
Yang Y.
Xie Z.
Zheng D.
Wang W.
Shanghai Jiaotong Daxue Xuebao/Journal of Shanghai Jiaotong University, 2022, 56 (08): 1078 - 1088
[37] Implicit Offensive Speech Detection Based on Multi-feature Fusion
Guo, Tengda
Lin, Lianxin
Liu, Hang
Zheng, Chengping
Tu, Zhijian
Wang, Haizhou
KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, PT II, KSEM 2023, 2023, 14118 : 27 - 38
[38] Speech emotion recognition based on Graph-LSTM neural network
Li, Yan
Wang, Yapeng
Yang, Xu
Im, Sio-Kei
EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2023, 2023 (01)
[39] Speech emotion recognition based on Graph-LSTM neural network
Yan Li
Yapeng Wang
Xu Yang
Sio-Kei Im
EURASIP Journal on Audio, Speech, and Music Processing, 2023
[40] Research on Teacher Classroom Teaching Speech Emotion Recognition Based on LSTM
He, Yimin
Lu, Xiaoyong
Sun, Dan
Pan, Tao
Qiu, Yuqing
Liu, Jiahong
2024 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING, IALP 2024, 2024: 326 - 331
