Enhanced Deep Hierarchical Long Short-Term Memory and Bidirectional Long Short-Term Memory for Tamil Emotional Speech Recognition using Data Augmentation and Spatial Features

Cited by: 1
Authors
Fernandes, Bennilo [1 ]
Mannepalli, Kasiprasad [1 ]
Affiliations
[1] Koneru Lakshmaiah Educ Fdn, Dept ECE, Guntur 520002, Andhra Pradesh, India
Source
Keywords
BILSTM; data augmentation; emotional recognition; LSTM; NETWORKS;
DOI
10.47836/pjst.29.4.39
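Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code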
07 ; 0710 ; 09 ;
Abstract
Neural networks have become increasingly popular for language modelling, but in large, deep models, overfitting and gradient problems remain important issues that heavily influence model performance. Since long short-term memory (LSTM) and bidirectional long short-term memory (BILSTM) individually handle long-term dependencies in sequential data, combining LSTM and BILSTM hierarchically adds reliability in minimising gradient, overfitting, and slow-learning issues. Hence, this paper presents four architectures: the Enhanced Deep Hierarchical LSTM & BILSTM (EDHLB), EDHBL, EDHLL, and EDHBB. The experimental evaluation of a deep hierarchical network with spatial and temporal features yields good results for all four models. The average accuracy is 92.12% for EDHLB, 93.13% for EDHBL, 94.14% for EDHLL, and 93.19% for EDHBB, whereas the basic models reach 74% (LSTM) and 77% (BILSTM). Across all the models evaluated, EDHBL performs best, with an average efficiency of 94.14% and an accuracy rate of 95.7%. On the collected Tamil emotional dataset, happiness, fear, anger, sadness, and neutral emotions are recognised with 100% accuracy in the cross-fold matrix, disgust with around 80% efficiency, and boredom with 75% accuracy. Moreover, EDHBL needs less training time and evaluation time than the other models. Therefore, the experimental analysis shows EDHBL to be superior to the other models on the collected Tamil emotional dataset, attaining 20% more efficiency than the basic models.
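The abstract does not say whether the "20% more efficiency" over the basic models is an absolute or a relative gain. The short sketch below computes both readings from the figures quoted in the abstract. The names `best_reported`, `gain_points`, and `gain_relative` are illustrative assumptions, not taken from the paper, and the sketch is only a consistency check on the reported numbers, not a reproduction of the authors' evaluation.

```python
# Baseline accuracies (%) exactly as quoted in the abstract.
baselines = {"LSTM": 74.0, "BILSTM": 77.0}

# Assumption: 94.14% is the figure the abstract quotes for the best model.
best_reported = 94.14


def gain_points(score: float, baseline: float) -> float:
    """Absolute gain in percentage points."""
    return round(score - baseline, 2)


def gain_relative(score: float, baseline: float) -> float:
    """Relative gain, expressed as a percentage of the baseline."""
    return round(100.0 * (score - baseline) / baseline, 2)


# Map each baseline to (absolute gain, relative gain).
gains = {
    name: (gain_points(best_reported, acc), gain_relative(best_reported, acc))
    for name, acc in baselines.items()
}

for name, (points, relative) in gains.items():
    print(f"vs {name}: +{points} points, +{relative}% relative")
```

Under these assumptions, the absolute gain over the LSTM baseline is about 20 percentage points, which appears to be the reading that matches the abstract's closing claim.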
Pages: 2967-2992
Page count: 26
Related papers
50 in total
  • [1] Deep Long Short-Term Memory Networks for Speech Recognition
    Chien, Jen-Tzung
    Misbullah, Alim
    [J]. 2016 10TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2016,
  • [2] Time Series-based Spoof Speech Detection Using Long Short-term Memory and Bidirectional Long Short-term Memory
    Mirza, Arsalan R.
    Al-Talabani, Abdulbasit K.
    [J]. ARO-THE SCIENTIFIC JOURNAL OF KOYA UNIVERSITY, 2024, 12 (02): : 119 - 129
  • [3] Long Short-term Memory for Tibetan Speech Recognition
    Wang, Weizhe
    Chen, Ziyan
    Yang, Hongwu
    [J]. PROCEEDINGS OF 2020 IEEE 4TH INFORMATION TECHNOLOGY, NETWORKING, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (ITNEC 2020), 2020, : 1059 - 1063
  • [4] Persian Optical Character Recognition Using Deep Bidirectional Long Short-Term Memory
    Khosrobeigi, Zohreh
    Veisi, Hadi
    Hoseinzade, Ehsan
    Shabanian, Hanieh
    [J]. APPLIED SCIENCES-BASEL, 2022, 12 (22):
  • [5] Applying Deep Bidirectional Long Short-Term Memory to Mandarin Tone Recognition
    Yang, Longfei
    Xie, Yanlu
    Zhang, Jinsong
    [J]. PROCEEDINGS OF 2018 14TH IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP), 2018, : 1124 - 1127
  • [6] Speech Dereverberation Using Long Short-Term Memory
    Mimura, Masato
    Sakai, Shinsuke
    Kawahara, Tatsuya
    [J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 2435 - 2439
  • [7] BIDIRECTIONAL QUATERNION LONG SHORT-TERM MEMORY RECURRENT NEURAL NETWORKS FOR SPEECH RECOGNITION
    Parcollet, Titouan
    Morchid, Mohamed
    Linares, Georges
    De Mori, Renato
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 8519 - 8523
  • [8] Bidirectional Long Short-Term Memory Network for Vehicle Behavior Recognition
    Zhu, Jiasong
    Sun, Ke
    Jia, Sen
    Lin, Weidong
    Hou, Xianxu
    Liu, Bozhi
    Qiu, Guoping
    [J]. REMOTE SENSING, 2018, 10 (06)
  • [9] Long short-term memory
    Hochreiter, S
    Schmidhuber, J
    [J]. NEURAL COMPUTATION, 1997, 9 (08) : 1735 - 1780
  • [10] Seismic Data Reconstruction Using Deep Bidirectional Long Short-Term Memory With Skip Connections
    Yoon, Daeung
    Yeeh, Zeu
    Byun, Joongmoo
    [J]. IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2021, 18 (07) : 1298 - 1302