Automatic Recognition of Kazakh Speech Using Deep Neural Networks

Cited by: 14
Authors
Mamyrbayev, Orken [1]
Turdalyuly, Mussa [1]
Mekebayev, Nurbapa [2]
Alimhan, Keylan [1]
Kydyrbekova, Aizat [2]
Turdalykyzy, Tolganay [1]
Affiliations
[1] Inst Informat & Computat Technol, Alma Ata 050010, Kazakhstan
[2] al Farabi Kazakh Natl Univ, Alma Ata 050040, Kazakhstan
Keywords
DNN; ASR; Kazakh speech recognition; LM
DOI
10.1007/978-3-030-14802-7_40
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This article presents a deep neural network (DNN) based automatic speech recognition system for the Kazakh language, developed using the Kaldi speech recognition toolkit. The DNNs are initialized using restricted Boltzmann machines (RBMs) and trained by standard error backpropagation with cross-entropy as the objective function. To achieve optimal results, the training has been modified to account for peculiarities of the Kazakh language. A 76-hour corpus was used for training. Results are compared for two different sets of values between classical models and various DNN settings.
Pages: 465-474
Page count: 10