EMOHRNET: HIGH-RESOLUTION NEURAL NETWORK BASED SPEECH EMOTION RECOGNITION

Cited by: 0

Authors:

Muppidi, Akshay [1]

Radfar, Martin [1]

Affiliations:

[1] SUNY Stony Brook, Dept Comp Sci, Stony Brook, NY 11794 USA

Source:

2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2024) | 2024

Keywords:

Speech emotion recognition; High Resolution Network; Frequency Masking; Time Masking;

DOI:

10.1109/ICASSP48485.2024.10446976

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Speech emotion recognition (SER) is pivotal for enhancing human-machine interactions. This paper introduces "EmoHRNet", a novel adaptation of High-Resolution Networks (HRNet) tailored for SER. The HRNet structure is designed to maintain high-resolution representations from the initial to the final layers. By transforming audio samples into spectrograms, EmoHRNet leverages the HRNet architecture to extract high-level features. EmoHRNet's unique architecture maintains high-resolution representations throughout, capturing both granular and overarching emotional cues from speech signals. The model outperforms leading models, achieving accuracies of 92.45% on RAVDESS, 80.06% on IEMOCAP, and 92.77% on EMOVO. Thus, we show that EmoHRNet sets a new benchmark in the SER domain.
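
The abstract's first processing step (audio to spectrogram) and the masking techniques named in the keywords can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, frame length, hop size, mask widths, and the use of a plain log-magnitude STFT are all assumptions chosen for illustration, and the record does not state which spectrogram type or mask sizes the paper uses.

```python
import numpy as np


def spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram from a Hann-windowed short-time Fourier transform.

    frame_len and hop are illustrative values, not taken from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))  # shape: (time, frequency)
    return np.log1p(magnitude).T  # shape: (frequency, time)


def mask_spectrogram(spec, max_freq_width=8, max_time_width=10, rng=None):
    """Zero out one random band of frequency bins and one random run of time frames.

    Returns a masked copy; the input array is left unchanged.
    The maximum widths are hypothetical defaults.
    """
    rng = np.random.default_rng() if rng is None else rng
    masked = spec.copy()
    n_freq, n_time = masked.shape

    # Frequency masking: blank a horizontal band of consecutive frequency bins.
    f_width = int(rng.integers(0, max_freq_width + 1))
    f_start = int(rng.integers(0, n_freq - f_width + 1))
    masked[f_start : f_start + f_width, :] = 0.0

    # Time masking: blank a vertical band of consecutive time frames.
    t_width = int(rng.integers(0, max_time_width + 1))
    t_start = int(rng.integers(0, n_time - t_width + 1))
    masked[:, t_start : t_start + t_width] = 0.0

    return masked


if __name__ == "__main__":
    # One second of a 440 Hz tone at an assumed 16 kHz sampling rate.
    t = np.arange(16000) / 16000.0
    tone = np.sin(2 * np.pi * 440.0 * t)
    spec = spectrogram(tone)
    augmented = mask_spectrogram(spec, rng=np.random.default_rng(0))
    print(spec.shape, augmented.shape)
```

In a setup like the one the abstract describes, the masked spectrogram would be the image-like input passed to the network during training; masking is applied to a copy so the clean spectrogram remains available for evaluation.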

Pages: 10881-10885

Page count: 5
