EMOHRNET: HIGH-RESOLUTION NEURAL NETWORK BASED SPEECH EMOTION RECOGNITION

Cited by: 0
Authors
Muppidi, Akshay [1]
Radfar, Martin [1]
Affiliations
[1] SUNY Stony Brook, Dept Comp Sci, Stony Brook, NY 11794 USA
Keywords
Speech emotion recognition; High Resolution Network; Frequency Masking; Time Masking;
DOI
10.1109/ICASSP48485.2024.10446976
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
Speech emotion recognition (SER) is pivotal for enhancing human-machine interactions. This paper introduces "EmoHRNet", a novel adaptation of High-Resolution Networks (HRNet) tailored for SER. The HRNet structure is designed to maintain high-resolution representations from the initial to the final layers. By transforming audio samples into spectrograms, EmoHRNet leverages the HRNet architecture to extract high-level features. EmoHRNet's unique architecture maintains high-resolution representations throughout, capturing both granular and overarching emotional cues from speech signals. The model outperforms leading models, achieving accuracies of 92.45% on RAVDESS, 80.06% on IEMOCAP, and 92.77% on EMOVO. Thus, we show that EmoHRNet sets a new benchmark in the SER domain.
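The abstract mentions converting audio into spectrograms, and the keywords list frequency masking and time masking. The sketch below illustrates those two steps in general terms only; it is not the authors' implementation. The function names (`log_spectrogram`, `mask_spectrogram`), the frame length, the hop size, the mask widths, the zero fill value, and the synthetic test tone are all assumptions made for illustration, since the record gives no such details.

```python
import numpy as np

def log_spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram of a 1-D signal, shape (freq_bins, frames).

    frame_len and hop are illustrative values, not taken from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, axis=1))  # (frames, freq_bins)
    return np.log1p(mag).T                      # (freq_bins, frames)

def mask_spectrogram(spec, max_freq_width, max_time_width, rng):
    """Zero one random band of frequency bins and one random band of frames.

    Filling with zero is an assumption; other fill values (e.g. the mean)
    are also common in masking-based augmentation.
    """
    out = spec.copy()
    n_freq, n_time = out.shape

    f = rng.integers(0, max_freq_width + 1)   # width of the frequency mask
    f0 = rng.integers(0, n_freq - f + 1)      # first masked frequency bin
    out[f0:f0 + f, :] = 0.0

    t = rng.integers(0, max_time_width + 1)   # width of the time mask
    t0 = rng.integers(0, n_time - t + 1)      # first masked frame
    out[:, t0:t0 + t] = 0.0
    return out

rng = np.random.default_rng(0)
# Synthetic 1-second, 16 kHz tone standing in for a speech recording.
signal = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
spec = log_spectrogram(signal)
augmented = mask_spectrogram(spec, max_freq_width=20, max_time_width=15, rng=rng)
```

With these assumed settings the spectrogram has 129 frequency bins and 124 frames. Masking leaves the shape unchanged and only overwrites the selected bands, so the augmented array can be fed to whatever image-style network consumes the original.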
Pages: 10881-10885
Number of pages: 5
Related papers
50 in total
  • [1] Speech Emotion Recognition Based on Deep Neural Network
    Zhu, Zijiang
    Hu, Yi
    Li, Junshan
    Li, Jianjun
    Wang, Junhua
    BASIC & CLINICAL PHARMACOLOGY & TOXICOLOGY, 2020, 126 : 154 - 154
  • [2] Speech emotion recognition based on spiking neural network and convolutional neural network
    Du, Chengyan
    Liu, Fu
    Kang, Bing
    Hou, Tao
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2025, 147
  • [3] Relative Speech Emotion Recognition Based Artificial Neural Network
    Fu, Liqin
    Mao, Xia
    Chen, Lijiang
    PACIIA: 2008 PACIFIC-ASIA WORKSHOP ON COMPUTATIONAL INTELLIGENCE AND INDUSTRIAL APPLICATION, VOLS 1-3, PROCEEDINGS, 2008, : 1111 - 1115
  • [4] Fault High-Resolution Recognition Method Based on Deep Neural Network
    Feng C.
    Pan J.
    Li C.
    Yao Q.
    Liu J.
    Diqiu Kexue - Zhongguo Dizhi Daxue Xuebao/Earth Science - Journal of China University of Geosciences, 2023, 48 (08): : 3044 - 3052
  • [5] Speech Emotion Recognition based on Interactive Convolutional Neural Network
    Cheng, Huihui
    Tang, Xiaoyu
    2020 IEEE 3RD INTERNATIONAL CONFERENCE ON INFORMATION COMMUNICATION AND SIGNAL PROCESSING (ICICSP 2020), 2020, : 163 - 167
  • [6] Speech Emotion Recognition with Hybrid Neural Network
    Wei, Chuanzheng
    Sun, Xiao
    Tian, Fang
    Ren, Fuji
    5TH INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING AND COMMUNICATIONS (BIGCOM 2019), 2019, : 298 - 302
  • [7] The Application of Capsule Neural Network Based CNN for Speech Emotion Recognition
    Wen, Xin-Cheng
    Liu, Kun-Hong
    Zhang, Wei-Ming
    Jiang, Kai
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 9356 - 9362
  • [8] Speech Emotion Recognition Based on a Recurrent Neural Network Classification Model
    Fonnegra, Ruben D.
    Diaz, Gloria M.
    ADVANCES IN COMPUTER ENTERTAINMENT TECHNOLOGY, ACE 2017, 2018, 10714 : 882 - 892
  • [9] Speech emotion recognition based on Graph-LSTM neural network
    Li, Yan
    Wang, Yapeng
    Yang, Xu
    Im, Sio-Kei
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2023, 2023 (01)