A Hybrid Convolutional Bi-Directional Gated Recurrent Unit System for Spoken Languages of JK and Ladakhi

Authors
Thukroo, Irshad Ahmad [1]
Bashir, Rumaan [1]
Giri, Kaiser J. J. [1]
Affiliations
[1] Islamic Univ Sci & Technol, Dept Comp Sci, 1 Univ Ave, Pulwama 192122, Jammu & Kashmir, India
Keywords
Language identification; convolutional neural network; long short-term memory; bi-directional gated recurrent unit; IIITH-ILSC; feature-selection method; neural network; identification; recognition
DOI
10.1142/S0219649223500284
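Chinese Library Classification (CLC) number
G25 [Library science and librarianship]; G35 [Information science and information work]
Discipline classification code
1205; 120501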
Abstract
Spoken language identification is the process of recognising the language spoken in an audio segment. It is a precursor to several technologies, such as automatic call routing, language recognition, multilingual conversation, language parsing, and sentiment analysis. Language identification is a challenging task for low-resource languages such as Kashmiri and Ladakhi, spoken in the Union Territories (UTs) of Jammu and Kashmir (JK) and Ladakh, India. This is mainly due to speaker variations such as duration, moderator, and ambiance, particularly when training and testing are done on different datasets to assess the accuracy of a language identification system in actual deployment, which produces low-accuracy results. To tackle this problem, we propose a hybrid convolutional bi-directional gated recurrent unit (Bi-GRU) model that exploits both the static and the dynamic behaviour of the audio signal in order to achieve better results than state-of-the-art models. The audio signals are first converted into two-dimensional structures called Mel-spectrograms, which represent the frequency distribution over time. To investigate the spectral behaviour of audio signals, we employ a convolutional neural network (CNN) that perceives Mel-spectrograms in multiple dimensions. The CNN-learned feature vector serves as input to the Bi-GRU, which maintains the dynamic behaviour of the audio signal. Experiments are done on six spoken languages: Ladakhi, Kashmiri, Hindi, Urdu, English, and Dogri. The data corpora used for experimentation are the International Institute of Information Technology Hyderabad-Indian Language Speech Corpus (IIITH-ILSC) and a self-created data corpus for the Ladakhi language. The model is tested on two datasets, one speaker-dependent and one speaker-independent. Results show that the proposed model achieves accuracies of 99% and 91% on the speaker-dependent and speaker-independent datasets, respectively, which is promising in comparison to the available state-of-the-art models.
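The Mel-spectrogram step mentioned in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the frame length, hop size, number of Mel bands, the HTK-style Mel formula, the Hann window, and all function names below are assumptions chosen for illustration, and a naive DFT stands in for the FFT a real pipeline would use.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert Hz to Mel (HTK-style formula; an assumed choice)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters spaced evenly on the Mel scale.

    Returns n_mels rows, each with n_fft // 2 + 1 weights.
    """
    n_bins = n_fft // 2 + 1
    max_mel = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 edge points: left edge, filter centres, right edge.
    edges = [mel_to_hz(max_mel * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)
            falling = (hi - f) / (hi - mid)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def power_spectrum(frame):
    """Naive O(n^2) DFT power spectrum of one windowed frame."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def mel_spectrogram(signal, sample_rate, n_fft=64, hop=32, n_mels=8):
    """Return a list of frames, each holding n_mels log-Mel energies."""
    # Periodic Hann window to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / n_fft)
              for t in range(n_fft)]
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = [signal[start + t] * window[t] for t in range(n_fft)]
        power = power_spectrum(frame)
        mel = [sum(w * p for w, p in zip(row, power)) for row in bank]
        # Small offset avoids log(0) in empty bands.
        frames.append([math.log(e + 1e-10) for e in mel])
    return frames


# Usage: a synthetic 1 kHz tone sampled at 8 kHz (hypothetical test input).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]
spec = mel_spectrogram(tone, sample_rate)
```

The result is a time-by-frequency grid (frames by Mel bands), which is the kind of two-dimensional input a CNN can process like an image. In a hybrid model of the sort described, the per-frame CNN features would then be passed to a recurrent layer in time order.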
Pages: 23