A Comparative Study of Khasi Speech Recognition Systems with Recurrent Neural Network-Based Language Model

Citations: 0
Authors
Deepajothi, S. [1]
Rao, Vuda Sreenivasa [2]
Ambhika, C. [3]
Mandala, Vishwanadham [4]
Rao, R. V. V. N. Bheema [5]
Kumar, Shailendra [6]
Gera, Venkateswara Rao [7]
Nagaraju, D. [8]
Affiliations
[1] SRM Inst Sci & Technol, Dept Comp Technol, Kattankulathur 603203, Tamil Nadu, India
[2] Koneru Lakshmaiah Educ Fdn, Dept Comp Sci & Engn, Vaddeswaram 522302, Andhra Pradesh, India
[3] RMD Engn Coll, Dept AIML, Rsm Nagar, Kavarapetai, India
[4] Indiana Univ, Bloomington, IN USA
[5] Aditya Coll Engn & Technol, Dept Informat Technol, Surampalem, India
[6] Integral Univ Lucknow, Dept ECE, Lucknow 226026, Uttar Pradesh, India
[7] Kallam Haranadhareddy Inst Technol, Dept ECE, Guntur, India
[8] Sri Venkatesa Perumal Coll Engn & Technol, Dept CSE, Puttur, Andhra Pradesh, India
Keywords
Hidden Markov model; Language model; Perceptual linear prediction; Gaussian mixture model; Acoustic model;
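DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809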
Abstract
This paper presents a comparative analysis of Khasi speech recognition systems that use a recurrent neural network-based language model (RNN-LM). Different acoustic models (AMs) are developed to identify the best-performing configuration. The RNN-LM is observed to outperform the other, traditional language models. A continuous speech database was collected with a recorder, and WaveSurfer was used for data processing. Moreover, a word error rate (WER) reduction in the range of 2.8-3.8% is obtained for the major speech data and 2.4-3.5% for the minor speech data. Additionally, two acoustic features are compared, and the experimental results show that the Mel frequency cepstral coefficient (MFCC) performs better than perceptual linear prediction (PLP).
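The abstract reports its results as word error rate (WER). As a rough illustration of how that metric is conventionally computed (word-level edit distance divided by the number of reference words), here is a minimal Python sketch. The function name `word_error_rate` and the placeholder tokens are assumptions made for illustration; this is not the authors' scoring tool, and the record does not say which one they used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Illustrative sketch only: whitespace tokenization, no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Placeholder tokens, not real Khasi transcripts.
print(word_error_rate("a b c d", "a x c"))
```

With the placeholder input above, one substitution and one deletion against four reference words give a WER of 0.5, i.e. 50%.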
Pages: 1296-1305
Number of pages: 10