Speech emotion recognition based on multi-feature speed rate and LSTM

Cited by: 4

Authors:

Yang, Zijun [1]

Li, Zhen [1]

Zhou, Shi [2]

Zhang, Lifeng [1]

Serikawa, Seiichi [1]

Affiliations:

[1] Kyushu Inst Technol, 1-1 Sensuicho, Tobata Ward, Kitakyushu, Fukuoka 8040011, Japan

[2] Huzhou Univ, 759, East 2nd Rd, Huzhou 313000, Zhejiang, Peoples R China

Source:

NEUROCOMPUTING | 2024 / Vol. 601

Keywords:

Speech emotion recognition; LSTM; Voiced sound; Phonogram; Short-time features; DEPRESSION; SEVERITY; SIGNALS;

DOI:

10.1016/j.neucom.2024.128177

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Correctly recognizing speech emotions is important in fields such as healthcare and human-computer interaction (HCI). However, the complexity of speech signal features makes speech emotion recognition challenging. This study introduces a multi-feature method for speech emotion recognition that combines short-time and rhythmic features. Using short-time energy, zero-crossing rate, and average amplitude difference, the proposed approach addresses overfitting concerns by reducing feature dimensionality. Employing a long short-term memory (LSTM) network, the experiments achieved notable accuracy across diverse datasets. Specifically, the proposed method achieved an accuracy of up to 98.47% on the CASIA dataset, 100% on the Emo-DB dataset, and 98.87% on the EMOVO dataset, demonstrating that it can discern speaker emotions across different languages and emotion classes. These findings underscore the significance of incorporating speech rate for emotional content recognition, which holds promise for application in HCI and auxiliary medical diagnostics.
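As an illustration of the three short-time features named in the abstract, the following is a minimal NumPy sketch that computes per-frame short-time energy, zero-crossing rate, and average amplitude difference. It is not the authors' implementation: the function names, the frame length (400 samples), the hop size (160 samples), and the lag of 1 are all assumptions chosen for the example, and the record does not state the paper's actual framing or feature definitions.

```python
import numpy as np

def frame_signal(x, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping frames, one frame per row.
    frame_len and hop are illustrative defaults, not values from the paper."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def short_time_energy(frames):
    """Sum of squared samples in each frame."""
    return np.sum(frames ** 2, axis=1)

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs whose signs differ (zeros count as positive)."""
    signs = np.sign(frames)
    signs[signs == 0] = 1
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def average_amplitude_difference(frames, lag=1):
    """Mean absolute difference between each frame and itself shifted by `lag` samples.
    A single fixed lag is an assumption made to keep the example small."""
    return np.mean(np.abs(frames[:, lag:] - frames[:, :-lag]), axis=1)

def short_time_features(x, frame_len=400, hop=160, lag=1):
    """Return an array of shape (n_frames, 3): energy, ZCR, amplitude difference."""
    frames = frame_signal(np.asarray(x, dtype=float), frame_len, hop)
    return np.stack(
        [
            short_time_energy(frames),
            zero_crossing_rate(frames),
            average_amplitude_difference(frames, lag),
        ],
        axis=1,
    )
```

Each row of the result is a three-value feature vector for one frame, so a full utterance becomes a short sequence of low-dimensional vectors, which is the kind of input a sequence model such as an LSTM consumes.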

Pages: 12