A Comparative Study of Khasi Speech Recognition Systems with Recurrent Neural Network-Based Language Model

Cited by: 0
Authors
Deepajothi, S. [1 ]
Rao, Vuda Sreenivasa [2 ]
Ambhika, C. [3 ]
Mandala, Vishwanadham [4 ]
Rao, R. V. V. N. Bheema [5 ]
Kumar, Shailendra [6 ]
Gera, Venkateswara Rao [7 ]
Nagaraju, D. [8 ]
Affiliations
[1] SRM Inst Sci & Technol, Dept Comp Technol, Kattankulathur 603203, Tamil Nadu, India
[2] Koneru Lakshmaiah Educ Fdn, Dept Comp Sci & Engn, Vaddeswaram 522302, Andhra Pradesh, India
[3] RMD Engn Coll, Dept AIML, Rsm Nagar, Kavarapetai, India
[4] Indiana Univ, Bloomington, IN USA
[5] Aditya Coll Engn & Technol, Dept Informat Technol, Surampalem, India
[6] Integral Univ Lucknow, Dept ECE, Lucknow 226026, Uttar Pradesh, India
[7] Kallam Haranadhareddy Inst Technol, Dept ECE, Guntur, India
[8] Sri Venkatesa Perumal Coll Engn & Technol, Dept CSE, Puttur, Andhra Pradesh, India
Keywords
Hidden Markov model; Language model; Perceptual linear prediction; Gaussian mixture model; Acoustic model
DOI
Not available
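Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809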
Abstract
This paper presents a comparative analysis of Khasi speech recognition systems that use a recurrent neural network-based language model (RNN-LM). Different acoustic models (AMs) are developed to identify the best-performing configuration. The RNN-LM is observed to perform better than the other, traditional models. WaveSurfer is used for data processing of a continuous speech database collected with a recorder. Moreover, a reduction in word error rate (WER) in the range of 2.8-3.8% is obtained for the major speech data and 2.4-3.5% for the minor speech data. Additionally, two acoustic features are compared, and the experimental results show that Mel frequency cepstral coefficients (MFCC) yield better performance than perceptual linear prediction (PLP).
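For readers unfamiliar with the metric, WER is conventionally defined as (substitutions + deletions + insertions) divided by the number of reference words, computed from a word-level edit-distance alignment. The abstract does not say which scoring tool the authors used, so the following is only a minimal sketch of that standard definition; the function name `word_error_rate` and the placeholder tokens are illustrative assumptions, not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate via word-level Levenshtein distance.

    Hypothetical helper for illustration; not the paper's scoring script.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Placeholder tokens, not real Khasi data:
    # one substitution (w2 -> wX) and one deletion (w4) over 4 reference words.
    print(word_error_rate("w1 w2 w3 w4", "w1 wX w3"))
```

Under this definition, a figure such as the 2.4-3.5% quoted above is a difference between two such ratios expressed in percentage points, assuming the abstract reports absolute rather than relative reductions.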
Pages: 1296-1305
Page count: 10