EMOHRNET: HIGH-RESOLUTION NEURAL NETWORK BASED SPEECH EMOTION RECOGNITION

Authors
Muppidi, Akshay [1]
Radfar, Martin [1]
Affiliations
[1] SUNY Stony Brook, Dept Comp Sci, Stony Brook, NY 11794 USA
Keywords
Speech emotion recognition; High Resolution Network; Frequency Masking; Time Masking
DOI
10.1109/ICASSP48485.2024.10446976
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Speech emotion recognition (SER) is pivotal for enhancing human-machine interactions. This paper introduces "EmoHRNet", a novel adaptation of High-Resolution Networks (HRNet) tailored for SER. The HRNet structure is designed to maintain high-resolution representations from the initial to the final layers. By transforming audio samples into spectrograms, EmoHRNet leverages the HRNet architecture to extract high-level features. EmoHRNet's unique architecture maintains high-resolution representations throughout, capturing both granular and overarching emotional cues from speech signals. The model outperforms leading models, achieving accuracies of 92.45% on RAVDESS, 80.06% on IEMOCAP, and 92.77% on EMOVO. Thus, we show that EmoHRNet sets a new benchmark in the SER domain.
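The abstract states that audio samples are transformed into spectrograms, and the keywords list frequency masking and time masking, but the record does not say how either step is implemented. The following is a minimal sketch of that preprocessing, assuming a plain log-magnitude STFT and SpecAugment-style masking that zeroes one random band per axis. The function names, FFT size, hop length, and mask widths are illustrative assumptions, not values from the paper.

```python
import numpy as np

def log_spectrogram(signal, n_fft=512, hop=128):
    """Log-magnitude STFT spectrogram, shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T
    return np.log1p(magnitude)

def mask_band(spec, axis, max_width, rng):
    """Zero one randomly placed band along `axis` (0 = frequency, 1 = time)."""
    out = spec.copy()
    width = int(rng.integers(0, max_width + 1))
    start = int(rng.integers(0, spec.shape[axis] - width + 1))
    index = [slice(None)] * spec.ndim
    index[axis] = slice(start, start + width)
    out[tuple(index)] = 0.0
    return out

rng = np.random.default_rng(0)
sr = 16000                                  # assumed sample rate
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 440.0 * t)      # one-second synthetic tone
spec = log_spectrogram(signal)
# Frequency masking first, then time masking (assumed order and widths).
augmented = mask_band(mask_band(spec, 0, 20, rng), 1, 30, rng)
print(spec.shape, augmented.shape)
```

In a training pipeline the masked spectrogram would be fed to the network as a single-channel image; the masking is applied only at training time in the usual SpecAugment setup, which is again an assumption here.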
Pages: 10881-10885
Number of pages: 5