Speech emotion recognition based on Bi-directional LSTM architecture and deep belief networks

Cited by: 14
Authors
Senthilkumar, N. [1]
Karpakam, S. [2]
Devi, M. Gayathri [3]
Balakumaresan, R. [4]
Dhilipkumar, P. [5]
Affiliations
[1] Dr NGP Inst Technol, Dept ECE, Coimbatore, India
[2] Sri Eshwar Coll Engn, Dept ECE, Coimbatore, India
[3] Muthayammal Engn Coll, Department of Biomed Engn, Rasipuram, India
[4] PSNA Coll Engn & Technol, Dept ECE, Dindigul, India
[5] Sri Ranganathar Inst Engn & Technol, Dept ECE, Coimbatore, India
Keywords
CNN; Deep learning; Emotion; LSTM; Radial basis function; Sequence selection
DOI
10.1016/j.matpr.2021.12.246
Chinese Library Classification (CLC) number
T [Industrial Technology]
Discipline classification code
08
Abstract
Machine learning algorithms often fail to recognize the emotion in an individual's speech. Speech Emotion Recognition (SER) plays a major role in real-time applications that analyze speech emotions, and it can be used in scenarios such as emergency centers and human behavior assessment. In this work, we design an architecture for analyzing similarity in clusters, based on a key sequence selection procedure. The selected sequence is transformed into a spectrogram using the short-time Fourier transform (STFT) algorithm. This is followed by extraction of discriminative and salient features. We have also added new features to the CNN to improve its recognition performance. Instead of the whole utterance, the key segments are processed separately to reduce the complexity of the structure. The proposed system is evaluated on different standard datasets for recognizing different kinds of objects. It is also evaluated over different time periods and achieves better recognition accuracy. The proposed SER model is proven to be robust and reliable when compared with the latest state-of-the-art methods.
Copyright (c) 2022 Elsevier Ltd. All rights reserved. Selection and peer-review under responsibility of the scientific committee of the International Conference on Innovation and Application in Science and Technology.
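As a rough illustration of the spectrogram step mentioned in the abstract, the sketch below computes a magnitude spectrogram with a short-time Fourier transform in plain Python. It is not the authors' implementation: this record does not give their parameters, so the function name `stft_magnitude`, the frame length, the hop size, the Hann window, and the synthetic test tone are all assumptions chosen for illustration.

```python
import cmath
import math


def stft_magnitude(signal, frame_len=64, hop=32):
    """Return a magnitude spectrogram: a list of time frames, each a list of frequency bins.

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    # Hann window tapers each frame to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    n_bins = frame_len // 2 + 1  # keep non-negative frequencies only
    spectrogram = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        # Direct DFT of the windowed frame (O(N^2), acceptable for a sketch).
        row = []
        for k in range(n_bins):
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            row.append(abs(acc))
        spectrogram.append(row)
    return spectrogram


# Hypothetical input: a 1 kHz tone sampled at 8 kHz (not data from the paper).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]
spec = stft_magnitude(tone)

# Locate the strongest frequency bin in the first frame and convert it to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
```

In a pipeline like the one the abstract describes, each row of `spec` would correspond to one time step of the spectrogram image passed to a CNN; a production system would more likely rely on an FFT-based library routine than on this direct DFT.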
Pages: 2180-2184
Page count: 5
Related papers
50 in total
  • [21] SPEECH EMOTION RECOGNITION WITH DUAL-SEQUENCE LSTM ARCHITECTURE
    Wang, Jianyou
    Xue, Michael
    Culhane, Ryan
    Diao, Enmao
    Ding, Jie
    Tarokh, Vahid
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 6474 - 6478
  • [22] Improving Speech Emotion Recognition Using Graph Attentive Bi-directional Gated Recurrent Unit Network
    Su, Bo-Hao
    Chang, Chun-Min
    Lin, Yun-Shao
    Lee, Chi-Chun
    INTERSPEECH 2020, 2020, : 506 - 510
  • [23] Exploring Sequential Feature Selection in Deep Bi-LSTM Models for Speech Emotion Recognition
    Harby, Fatma
    Alohali, Mansor
    Thaljaoui, Adel
    Talaat, Amira Samy
    CMC-COMPUTERS MATERIALS & CONTINUA, 2024, 78 (02): : 2689 - 2719
  • [24] IMPROVED KNOWLEDGE DISTILLATION FROM BI-DIRECTIONAL TO UNI-DIRECTIONAL LSTM CTC FOR END-TO-END SPEECH RECOGNITION
    Kurata, Gakuto
    Audhkhasi, Kartik
    2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018), 2018, : 411 - 417
  • [25] Development of Speech Emotion Recognition System using Deep Belief Networks in Malayalam language
    Chandran, Athira
    Pravena, D.
    Govind, D.
    2017 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2017, : 676 - 680
  • [26] Improving Phonetic Recognition with Sequence-length Standardized MFCC Features and Deep Bi-Directional LSTM
    Toan Pham Van
    Hau Nguyen Thanh
    Ta Minh Thanh
    PROCEEDINGS OF 2018 5TH NAFOSTED CONFERENCE ON INFORMATION AND COMPUTER SCIENCE (NICS 2018), 2018, : 322 - 325
  • [27] Evaluation of Speech-to-Gesture Generation Using Bi-Directional LSTM Network
    Hasegawa, Dai
    Kaneko, Naoshi
    Shirakawa, Shinichi
    Sakuta, Hiroshi
    Sumi, Kazuhiko
    18TH ACM INTERNATIONAL CONFERENCE ON INTELLIGENT VIRTUAL AGENTS (IVA'18), 2018, : 79 - 86
  • [28] Learning to Monitor Machine Health with Convolutional Bi-Directional LSTM Networks
    Zhao, Rui
    Yan, Ruqiang
    Wang, Jinjiang
    Mao, Kezhi
    SENSORS, 2017, 17 (02)
  • [29] ADAPTIVE CONVOLUTIONALLY ENCHANCED BI-DIRECTIONAL LSTM NETWORKS FOR CHOREOGRAPHIC MODELING
    Bakalos, Nikolaos
    Rallis, Ioannis
    Doulamis, Nikolaos
    Doulamis, Anastasios
    Voulodimos, Athanasios
    Protopapadakis, Eftychios
    2020 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2020, : 1826 - 1830
  • [30] DB-LSTM: Densely-connected Bi-directional LSTM for human action recognition
    He, Jun-Yan
    Wu, Xiao
    Cheng, Zhi-Qi
    Yuan, Zhaoquan
    Jiang, Yu-Gang
    NEUROCOMPUTING, 2021, 444 : 319 - 331