A Hybrid Convolutional Bi-Directional Gated Recurrent Unit System for Spoken Languages of JK and Ladakhi

Authors
Thukroo, Irshad Ahmad [1]
Bashir, Rumaan [1]
Giri, Kaiser J. J. [1]
Affiliations
[1] Islamic Univ Sci & Technol, Dept Comp Sci, 1 Univ Ave, Pulwama 192122, Jammu & Kashmir, India
Keywords
Language identification; convolutional neural network; long short-term memory; bi-directional gated recurrent unit; IIITH-ILSC; feature-selection method; neural network; identification; recognition
DOI
10.1142/S0219649223500284
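Chinese Library Classification (CLC) number
G25 [Library Science, Librarianship]; G35 [Information Science, Information Work]
Discipline classification code
1205; 120501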
Abstract
Spoken language identification is the process of recognising the language in an audio segment and is the precursor for several technologies such as automatic call routing, language recognition, multilingual conversation, language parsing, and sentiment analysis. Language identification is a challenging task for low-resource languages like Kashmiri and Ladakhi, spoken in the union territories (UTs) of Jammu and Kashmir (JK) and Ladakh, India. This is mainly due to variations in duration, moderator, and ambience, particularly when training and testing are done on different datasets to assess how accurate a language identification system is in actual deployment; such conditions produce low accuracy. To tackle this problem, we propose a hybrid convolutional bi-directional gated recurrent unit (Bi-GRU) that exploits both the static and the dynamic behaviour of the audio signal to achieve better results than state-of-the-art models. The audio signals are first converted into two-dimensional structures called Mel-spectrograms, which represent the frequency distribution over time. To investigate the spectral behaviour of the audio signals, we employ a convolutional neural network (CNN) that perceives Mel-spectrograms in multiple dimensions. The CNN-learned feature vector serves as input to the Bi-GRU, which maintains the dynamic behaviour of the audio signal. Experiments are done on six spoken languages: Ladakhi, Kashmiri, Hindi, Urdu, English, and Dogri. The data corpora used for experimentation are the International Institute of Information Technology Hyderabad-Indian Language Speech Corpus (IIITH-ILSC) and a self-created data corpus for the Ladakhi language. The model is tested on two datasets, one speaker-dependent and one speaker-independent. Results show that the proposed model achieves best accuracies of 99% on the speaker-dependent dataset and 91% on the speaker-independent dataset, which is promising in comparison to the available state-of-the-art models.
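The abstract's final stage, a Bi-GRU reading a sequence of CNN-derived feature vectors, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the record gives no layer sizes or training details, so the dimensions, the random untrained weights, the random vectors standing in for CNN outputs, and all function names (`init_gru`, `gru_step`, `run_gru`, `bi_gru`) are assumptions chosen only to show the data flow. It uses the common update convention h_t = (1 - z) * h_{t-1} + z * h_candidate.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def init_gru(input_dim, hidden_dim, rng):
    # Random, untrained weights for one GRU direction (illustrative only).
    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    params = {}
    for gate in ("z", "r", "h"):
        params["W" + gate] = mat(hidden_dim, input_dim)   # input weights
        params["U" + gate] = mat(hidden_dim, hidden_dim)  # recurrent weights
        params["b" + gate] = [0.0] * hidden_dim           # bias
    return params

def gru_step(p, x, h_prev):
    # Update gate z and reset gate r.
    z = [sigmoid(a + b + c) for a, b, c in
         zip(matvec(p["Wz"], x), matvec(p["Uz"], h_prev), p["bz"])]
    r = [sigmoid(a + b + c) for a, b, c in
         zip(matvec(p["Wr"], x), matvec(p["Ur"], h_prev), p["br"])]
    # Candidate state uses the reset-gated previous state.
    rh = [ri * hi for ri, hi in zip(r, h_prev)]
    h_cand = [math.tanh(a + b + c) for a, b, c in
              zip(matvec(p["Wh"], x), matvec(p["Uh"], rh), p["bh"])]
    # Blend previous state and candidate state.
    return [(1 - zi) * hp + zi * hc for zi, hp, hc in zip(z, h_prev, h_cand)]

def run_gru(p, frames, hidden_dim):
    # Run one direction over the sequence, starting from a zero state.
    h = [0.0] * hidden_dim
    outputs = []
    for x in frames:
        h = gru_step(p, x, h)
        outputs.append(h)
    return outputs

def bi_gru(fwd, bwd, frames, hidden_dim):
    # Forward pass reads frames left to right; backward pass reads them
    # right to left and is re-reversed so both align per time step.
    forward = run_gru(fwd, frames, hidden_dim)
    backward = run_gru(bwd, frames[::-1], hidden_dim)[::-1]
    # Concatenate the two hidden states at each time step.
    return [f + b for f, b in zip(forward, backward)]

rng = random.Random(0)
T, feat_dim, hidden_dim = 5, 8, 4  # assumed sizes, not from the paper
# Random vectors standing in for per-frame CNN features.
frames = [[rng.uniform(-1, 1) for _ in range(feat_dim)] for _ in range(T)]
fwd = init_gru(feat_dim, hidden_dim, rng)
bwd = init_gru(feat_dim, hidden_dim, rng)
states = bi_gru(fwd, bwd, frames, hidden_dim)
print(len(states), len(states[0]))  # 5 8
```

Each time step yields a vector of size 2 × `hidden_dim`, combining context from earlier and later frames; in a full classifier these states would typically be pooled and passed to a softmax layer over the six languages.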
Pages: 23