Speech emotion recognition based on multi-feature speed rate and LSTM

Cited by: 4

Authors:

Yang, Zijun [1]

Li, Zhen [1]

Zhou, Shi [2]

Zhang, Lifeng [1]

Serikawa, Seiichi [1]

Affiliations:

[1] Kyushu Inst Technol, 1-1 Sensuicho,Tobata Ward, Kitakyushu, Fukuoka 8040011, Japan

[2] Huzhou Univ, 759,East 2nd Rd, Huzhou 313000, Zhejiang, Peoples R China

Source:

NEUROCOMPUTING | 2024 / Vol. 601

Keywords:

Speech emotion recognition; LSTM; Voiced sound; Phonogram; Short-time features; DEPRESSION; SEVERITY; SIGNALS;

DOI:

10.1016/j.neucom.2024.128177

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Correctly recognizing speech emotions is important in fields such as healthcare and human-computer interaction (HCI). However, the complexity of speech signal features poses challenges for speech emotion recognition. This study introduces a novel multi-feature method for speech emotion recognition that combines short-time and rhythmic features. Using short-time energy, zero-crossing rate, and average amplitude difference, the proposed approach addresses overfitting concerns by reducing feature dimensionality. With a long short-term memory (LSTM) network, the experiments achieved notable accuracy across diverse datasets. Specifically, the proposed method achieved an accuracy of up to 98.47% on the CASIA dataset, 100% on the Emo-DB dataset, and 98.87% on the EMOVO dataset, demonstrating its capability to accurately discern speaker emotions across different languages and emotion classes. These findings underscore the significance of incorporating speech rate for emotional content recognition, which holds promise for application in HCI and auxiliary medical diagnostics.
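The three short-time features named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the frame length, hop size, and lag (`frame_len`, `hop`, `lag`), the function names, and the exact definition of each feature are assumptions chosen for illustration, and the abstract does not state how the features are computed or combined with speech rate.

```python
import numpy as np


def frame_signal(x, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping frames, one frame per row.

    frame_len and hop are illustrative defaults (25 ms / 10 ms at 16 kHz),
    not values taken from the paper.
    """
    x = np.asarray(x, dtype=float)
    if len(x) < frame_len:
        # Zero-pad very short signals so that at least one frame exists.
        x = np.pad(x, (0, frame_len - len(x)))
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def short_time_energy(frames):
    """Sum of squared samples in each frame."""
    return np.sum(frames ** 2, axis=1)


def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.sign(frames)
    signs[signs == 0] = 1  # treat exact zeros as positive
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)


def average_amplitude_difference(frames, lag=1):
    """Mean absolute difference between samples that are `lag` apart."""
    return np.mean(np.abs(frames[:, lag:] - frames[:, :-lag]), axis=1)


def short_time_features(x, frame_len=400, hop=160):
    """Return an array of shape (n_frames, 3): energy, ZCR, amplitude difference."""
    frames = frame_signal(x, frame_len, hop)
    return np.stack(
        [
            short_time_energy(frames),
            zero_crossing_rate(frames),
            average_amplitude_difference(frames),
        ],
        axis=1,
    )
```

Under these assumptions, each utterance becomes a sequence of three-dimensional frame vectors, a low-dimensional input of the kind a recurrent model such as an LSTM consumes one time step at a time.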


Pages: 12

