Development of a Vietnamese Large Vocabulary Continuous Speech Recognition System under Noisy Conditions

Cited by: 4
Authors
Quoc Bao Nguyen [1]
Van Tuan Mai [1]
Quang Trung Le [1]
Ba Quyen Dam [1]
Van Hai Do [2, 3]
Affiliations
[1] Viettel Grp, Cyberspace Ctr, Hanoi, Vietnam
[2] Thuyloi Univ, Hanoi, Vietnam
[3] Viettel Grp, Hanoi, Vietnam
Keywords
Vietnamese speech recognition; speech corpus; noisy condition; model adaptation; system combination
DOI
10.1145/3287921.3287938
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
In this paper, we first present our effort to collect a 500-hour corpus of Vietnamese read speech. We then apply various techniques, including data augmentation, recurrent neural network language model rescoring, language model adaptation, bottleneck features, and system combination, to build the speech recognition system. Our final system achieves a low word error rate of 6.9% on the noisy test set.
Pages: 222-226
Number of pages: 5