Leveraging native language information for improved accented speech recognition

Cited by: 10
Authors
Ghorbani, Shahram [1]
Hansen, John H. L. [1]
Affiliations
[1] Univ Texas Dallas, CRSS, Richardson, TX 75080 USA
Keywords
recurrent neural network; acoustic modeling; accented speech; multilingual
DOI
10.21437/Interspeech.2018-1378
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Recognition of accented speech is a long-standing challenge for automatic speech recognition (ASR) systems, given the growing worldwide population of bilingual speakers who use English as their second language. If foreign-accented speech is viewed as an interpolation between the native language (L1) and English (L2), a model that can simultaneously address both languages should perform better at the acoustic level for accented speech. In this study, we explore how an end-to-end recurrent neural network (RNN) system trained on both English and the speakers' native languages (Spanish and Indian languages) can leverage native-language data to improve performance on accented English speech. To this end, we examine pre-training with native-language data, as well as multi-task learning (MTL) in which the main task is trained on native English and the secondary task on Spanish or Indian languages. We show that the proposed MTL model performs better than the pre-training approach and outperforms a baseline model trained only on English data. We further propose a new MTL setting in which the secondary task is trained on both English and the native language, using the same output set. This scenario performs best, yielding character error rate improvements of +11.95% and +17.55% over the baseline for Hispanic and Indian accents, respectively.
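To make the MTL configuration concrete, below is a minimal PyTorch sketch, not the authors' implementation: a shared recurrent acoustic encoder feeds two task-specific CTC output heads, a main head for English characters and a secondary head for the native language (L1), with the two tasks trained on alternating batches of their own utterances. All layer sizes, character-inventory sizes, and the loss weight mtl_weight are illustrative assumptions rather than values from the paper.

import torch
import torch.nn as nn

class MTLAccentedASR(nn.Module):
    def __init__(self, n_feats=40, hidden=320, n_en_chars=29, n_l1_chars=33):
        super().__init__()
        # Shared acoustic encoder: a stacked bidirectional LSTM over
        # frame-level features (e.g. log-Mel filterbanks).
        self.encoder = nn.LSTM(n_feats, hidden, num_layers=3,
                               bidirectional=True, batch_first=True)
        # Task-specific output layers (+1 for the CTC blank symbol).
        self.en_head = nn.Linear(2 * hidden, n_en_chars + 1)  # main task: English
        self.l1_head = nn.Linear(2 * hidden, n_l1_chars + 1)  # secondary task: L1

    def forward(self, feats):
        enc, _ = self.encoder(feats)  # (batch, time, 2*hidden)
        return self.en_head(enc), self.l1_head(enc)

def task_loss(model, feats, feat_lens, targets, target_lens, task,
              mtl_weight=0.3):
    # One training step's loss. `task` is "en" for English batches (main
    # task) or "l1" for native-language batches (secondary task); the
    # weight mtl_weight is a hypothetical interpolation factor.
    ctc = nn.CTCLoss(blank=0, zero_infinity=True)
    en_logits, l1_logits = model(feats)
    logits = en_logits if task == "en" else l1_logits
    # nn.CTCLoss expects (time, batch, classes) log-probabilities.
    log_probs = logits.log_softmax(-1).transpose(0, 1)
    loss = ctc(log_probs, targets, feat_lens, target_lens)
    # Down-weight gradients from the secondary task relative to the main task.
    return (1.0 - mtl_weight) * loss if task == "en" else mtl_weight * loss

In the best-performing setting the abstract describes, the secondary task additionally sees English utterances mapped to the same output set; in this sketch that would amount to also routing some English batches through the "l1" branch.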
Pages: 2449 - 2453
Page count: 5
Related papers (50 in total)
  • [1] Speech Recognition of Accented Mandarin Based on Improved Conformer
    Yang, Xing-Yao
    Zhang, Shao-Dong
    Xiao, Rui
    Yu, Jiong
    Li, Zi-Yang
    [J]. SENSORS, 2023, 23 (08)
  • [2] Recognition of foreign-accented vocoded speech by native English listeners
    Yang, Jing
    Barrett, Jenna
    Yin, Zhigang
    Xu, Li
    [J]. ACTA ACUSTICA, 2023, 7
  • [3] Leveraging Speech Production Knowledge for Improved Speech Recognition
    Sangwan, Abhijeet
    Hansen, John H. L.
    [J]. 2009 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION & UNDERSTANDING (ASRU 2009), 2009: 58 - 63
  • [4] Orthographic information facilitates discrimination of native and foreign-accented speech
    Sevich, Victoria
    Clausing, Emily M.
    Moberly, Aaron C.
    Tamati, Terrin N.
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2023, 153 (03)
  • [5] Leveraging relevance cues for language modeling in speech recognition
    Chen, Berlin
    Chen, Kuan-Yu
    [J]. INFORMATION PROCESSING & MANAGEMENT, 2013, 49 (04): 807 - 816
  • [6] Improving Language Identification of Accented Speech
    Kukk, Kunnar
    Alumae, Tanel
    [J]. INTERSPEECH 2022, 2022: 1288 - 1292
  • [7] Correction while Recognition: Combining Pretrained Language Model for Taiwan-Accented Speech Recognition
    Li, Sheng
    Li, Jiyi
    [J]. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT VII, 2023, 14260: 389 - 400
  • [8] Effects of Listener Age and Native Language Experience on Recognition of Accented and Unaccented English Words
    Gordon-Salant, Sandra
    Yeni-Komshian, Grace H.
    Bieber, Rebecca E.
    Ureta, David A. Jara
    Freund, Maya S.
    Fitzgibbons, Peter J.
    [J]. JOURNAL OF SPEECH LANGUAGE AND HEARING RESEARCH, 2019, 62 (04): 1131 - 1143
  • [9] Improved Accented Speech Recognition Using Accent Embeddings and Multi-task Learning
    Jain, Abhinav
    Upreti, Minali
    Jyothi, Preethi
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018: 2454 - 2458
  • [10] The role of acoustic similarity in listening to foreign-accented speech: Recognition of Spanish-accented English words by Japanese native listeners
    Matsui, Sanae
    [J]. ACOUSTICAL SCIENCE AND TECHNOLOGY, 2024, 45 (04) : 216 - 223