Exploring the Effect of Tones for Myanmar Language Speech Recognition Using Convolutional Neural Network (CNN)

Cited by: 1
Authors
Mon, Aye Nyein [1]
Pa, Win Pa [1]
Thu, Ye Kyaw [2]
Affiliations
[1] Univ Comp Studies, Nat Language Proc Lab, Yangon, Myanmar
[2] Okayama Prefectural Univ, Artificial Intelligence Lab, Okayama, Japan
Keywords
Tone information; Automatic Speech Recognition (ASR); Tonal language; Deep Neural Network (DNN); Convolutional Neural Network (CNN);
DOI
10.1007/978-981-10-8438-6_25
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Tone information is very helpful for improving automatic speech recognition (ASR) performance in tonal languages such as Mandarin, Thai, and Vietnamese. Since Myanmar is considered a tonal language, this work explores the effect of tones on both syllable-based and word-based ASR performance. The experiments model tones by integrating them into the phoneme set and incorporating them into a Convolutional Neural Network (CNN), a state-of-the-art acoustic model. Moreover, for more effective tone modeling, tonal questions are used to build the phonetic decision tree. With tone information, the experiments show that the CNN model improves on the Deep Neural Network (DNN) baseline by nearly 2% for word-based ASR and by more than 2% for syllable-based ASR. The CNN model with tone information also reduces word error rate (WER) by 2.43% and syllable error rate (SER) by 2.26% compared with the same model without tone information.
Pages: 314-326
Page count: 13
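
The abstract reports its results as word error rate (WER) and syllable error rate (SER). As a minimal sketch of how such a metric is conventionally computed (not the authors' scoring setup, which this record does not describe), the error rate is the token-level edit distance between a reference and a hypothesis, divided by the reference length. The function names `edit_distance` and `error_rate` and the placeholder tokens are assumptions made for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref_tokens, hyp_tokens):
    """Error rate in percent: edit distance over reference length."""
    if not ref_tokens:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


# Placeholder tokens (hypothetical): one substitution and one deletion
# against a four-token reference.
reference = ["a", "b", "c", "d"]
hypothesis = ["a", "x", "c"]
print(error_rate(reference, hypothesis))
```

Passing word tokens gives WER; passing syllable tokens to the same function gives SER, since only the segmentation of the input changes.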