Speech-Based Emotion Classification for Human by Introducing Upgraded Long Short-Term Memory (ULSTM)

Cited by: 0
Authors
Bhowmik, Subhrajit [1]
Chatterjee, Akshay [1]
Biswas, Sampurna [1]
Farhin, Reshmina [1]
Yasmin, Ghazaala [1]
Affiliation
[1] St Thomas Coll Engn & Technol, Dept Comp Sci & Engn, 4 Diamond Harbour Rd, Kolkata 700023, India
Keywords
Emotion recognition; Feature extraction; Classification model; Convolution neural network (CNN); Recurrent neural network (RNN); Gated recurrent unit (GRU); Long short-term memory (LSTM); Upgraded long short-term memory (ULSTM)
DOI
10.1007/978-981-15-2449-3_8
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Humans acquire emotional intelligence through social skills and by interacting with and imitating other people, and we refine our ability to analyse different emotions through experience of our surroundings. Can a machine learn the same ability through artificial intelligence? Current work explores this question with deep learning, which increases a machine's learning capacity. That capacity matters in human emotion recognition because one emotion can shade into another, which makes emotions difficult to tell apart. This difficulty motivated the present study. The proposed method categorizes human emotions with four deep learning models: convolutional neural network (CNN), recurrent neural network (RNN), long short-term memory (LSTM) and gated recurrent unit (GRU). The models are trained on well-known physical and perceptual features. The system is tested on the benchmark Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS). The four models are also compared on this dataset with respect to the vanishing gradient problem. In addition, an upgraded LSTM model (ULSTM) is proposed to obtain better accuracy, and it is tested against the existing LSTM model.
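As background for the vanishing gradient comparison mentioned in the abstract, the following is a minimal sketch of the effect in a plain scalar RNN. It is not the authors' code or the ULSTM model: the function name `rnn_gradient_through_time`, the weight `w = 0.5` and the constant input `x = 0.1` are illustrative assumptions. The sketch multiplies the per-step chain-rule factors of `h_t = tanh(w * h_{t-1} + x)` to show how the gradient with respect to the initial state shrinks as the sequence gets longer, which is the problem that gated models such as LSTM and GRU were designed to mitigate.

```python
import math


def rnn_gradient_through_time(steps, w=0.5, x=0.1):
    """Return d h_T / d h_0 for the scalar RNN h_t = tanh(w * h_{t-1} + x).

    The weight w and input x are hypothetical values chosen for illustration.
    """
    h = 0.0      # initial hidden state h_0
    grad = 1.0   # running product of per-step derivatives
    for _ in range(steps):
        h = math.tanh(w * h + x)
        # Chain rule: d h_t / d h_{t-1} = w * (1 - tanh(.)^2) = w * (1 - h_t^2)
        grad *= w * (1.0 - h * h)
    return grad


if __name__ == "__main__":
    # Each factor is at most |w| = 0.5, so the product decays geometrically.
    for steps in (1, 5, 20, 50):
        print(steps, rnn_gradient_through_time(steps))
```

With these assumed values every factor is below 0.5, so the gradient after 50 steps is bounded by 0.5 to the power 50 and contributes almost nothing to a weight update.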
Pages: 101-112
Page count: 12