Development of a Vietnamese Large Vocabulary Continuous Speech Recognition System under Noisy Conditions

Cited by: 4

Authors:

Quoc Bao Nguyen ^{[1]}

Van Tuan Mai ^{[1]}

Quang Trung Le ^{[1]}

Ba Quyen Dam ^{[1]}

Van Hai Do ^{[2,3]}

Affiliations:

[1] Viettel Grp, Cyberspace Ctr, Hanoi, Vietnam

[2] Thuyloi Univ, Hanoi, Vietnam

[3] Viettel Grp, Hanoi, Vietnam

Source:

PROCEEDINGS OF THE NINTH INTERNATIONAL SYMPOSIUM ON INFORMATION AND COMMUNICATION TECHNOLOGY (SOICT 2018) | 2018

Keywords:

Vietnamese speech recognition; speech corpus; noisy condition; model adaptation; system combination;

DOI:

10.1145/3287921.3287938

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

In this paper, we first present our effort to collect a 500-hour corpus of Vietnamese read speech. We then apply various techniques, including data augmentation, recurrent neural network language model rescoring, language model adaptation, bottleneck features, and system combination, to build the speech recognition system. Our final system achieves a word error rate of 6.9% on the noisy test set.

Pages: 222-226

Page count: 5
