A Novel Approach for Vietnamese Speech Recognition Using Conformer

Cited by: 0
Authors
Tuan, Nguyen Van Anh [1]
Hoa, Nguyen Thi Thanh [1]
Dat, Nguyen Thanh [1]
Tuan, Pham Minh [1]
Truong, Dao Duy [1]
Phuc, Dang Thi [1]
Affiliation
[1] Ind Univ Ho Chi Minh City, Fac Informat Technol, Ho Chi Minh City, Vietnam
Keywords
Deep learning; CTC; Joint CTC/Attention; Conformer; Vietnamese speech recognition
DOI
10.1007/978-981-19-8069-5_53
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Speech recognition has been studied for a long time, but very little research has applied deep learning to Vietnamese speech recognition. In this paper, we address the Vietnamese speech recognition problem with deep learning speech recognition frameworks, namely CTC and Joint CTC/Attention, combined with the Conformer encoder architecture. Experiments using over 115 h of training data from VLSP and Vivos achieved moderate accuracy. Compared with the other models, the Conformer model trained with CTC achieved good results, with a word error rate (WER) of 20%. Training on large data gives remarkable results and is the basis for us to continue improving the model and increasing its accuracy in the future.
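Note on the metric: the abstract reports accuracy as WER. As a minimal sketch of how WER is conventionally computed (word-level edit distance divided by the number of reference words), and not the authors' evaluation code, the function name `word_error_rate` and the sample sentences below are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical sentences (not taken from VLSP or Vivos): one substituted
# word out of five reference words.
wer = word_error_rate("xin chào các bạn nhé", "xin chào cái bạn nhé")
print(f"WER = {wer:.0%}")
```

With these made-up sentences, one substitution in five reference words gives a WER of 20%, that is, roughly one word error per five reference words.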
Pages: 723-730
Number of pages: 8