Revolutionizing Speech Emotion Recognition: A Novel Hilbert Curve Approach for Two-Dimensional Representation and Convolutional Neural Network Classification

Authors
Tyagi, Suryakant [1]
Szenasi, Sandor [2, 3]
Affiliations
[1] Obuda Univ, Doctoral Sch Appl Informat & Appl Math, H-1034 Budapest, Hungary
[2] Obuda Univ, John Von Neumann Fac Informat, Budapest, Hungary
[3] J Selye Univ, Fac Econ & Informat, Komarno, Slovakia
Keywords
Speech emotion recognition (SER); Hilbert curve; TESS; Gram angle fields; CyTex; FEATURES;
DOI
10.1007/978-3-031-59257-7_8
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Emotions are integral to human existence, influencing psychological wellbeing and permeating various aspects of daily life. Speech emotion recognition (SER) stands as a pivotal branch of emotion detection, focusing on decoding the acoustic nuances embedded in speech signals. This study delves into the landscape of SER, addressing challenges related to feature extraction and classifier development. Inspired by the Hilbert curve, a novel approach is proposed, converting one-dimensional time series data into informative two-dimensional images. A convolutional neural network extracts features from these images, and a fully connected network processes these features for sentiment classification. The study comprehensively evaluates this method across four diverse datasets, namely RAVDESS, TESS, SAVEE, and EmoDB. The proposed algorithm demonstrates promising results, showcasing potential advantages in emotion recognition tasks. Comparative analyses with existing methodologies, including Gram Angle Fields (GAF) and CyTex, affirm the feasibility and effectiveness of the proposed algorithm. The study contributes to advancing sentiment recognition by transforming time-series data into two-dimensional images, thereby opening new avenues in speech emotion recognition with improved accuracy and performance. The paper outlines the algorithms employed, details the methodology, presents experimental results, and concludes with reflections on findings and potential future directions.
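The abstract does not specify how the one-dimensional signal is laid out along the Hilbert curve, so the following is only a minimal sketch of the general idea, not the authors' implementation. It uses the standard index-to-coordinate conversion for a Hilbert curve on a 2^order by 2^order grid. The function names `hilbert_d2xy` and `signal_to_image`, the zero padding, and the truncation of overly long signals are assumptions made for illustration.

```python
def hilbert_d2xy(order, d):
    """Map position d along a Hilbert curve to (x, y) on a 2**order grid."""
    side = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the current sub-square so consecutive cells stay adjacent.
        if ry == 0:
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def signal_to_image(signal, order, pad_value=0.0):
    """Place 1-D samples along the curve (assumed layout: truncate or pad to fit)."""
    side = 1 << order
    n_cells = side * side
    samples = list(signal[:n_cells])
    samples += [pad_value] * (n_cells - len(samples))
    image = [[pad_value] * side for _ in range(side)]
    for d, value in enumerate(samples):
        x, y = hilbert_d2xy(order, d)
        image[y][x] = value  # row index is y, column index is x
    return image


# Usage: lay out 16 samples on a 4x4 grid.
image = signal_to_image(list(range(16)), order=2)
```

Because consecutive curve positions always land in neighbouring cells, samples that are close in time remain close in the image, which is the property that makes such a layout a plausible input for a convolutional network.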
Pages: 75-85
Page count: 11