CROSS-LINGUAL PHONEME MAPPING FOR LANGUAGE ROBUST CONTEXTUAL SPEECH RECOGNITION

Cited by: 0
Authors
Patel, Ami [1]
Li, David [1]
Cho, Eunjoon [1]
Aleksic, Petar [1]
Affiliations
[1] Google Inc, Mountain View, CA 94043 USA
Keywords
cross-lingual; speech recognition;
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Standard automatic speech recognition (ASR) systems are increasingly expected to recognize foreign entities, yet doing so while preserving accuracy on native words remains a challenge. We describe a novel approach for recognizing foreign words by injecting them, with appropriate pronunciations, into the recognizer decoder search space on the fly. The pronunciations are generated by mapping pronunciations from the foreign language's lexicon to the target recognizer language's phoneme inventory. The phoneme mapping itself is learned automatically using acoustic coupling of text-to-speech (TTS) audio and a pronunciation learning algorithm. Evaluation of our algorithm on Google Assistant use cases shows we can improve recognition of media-related queries by incorporating English entity pronunciations in French and German recognizers, with win/loss ratios of roughly 2:1 to 3:1, without hurting recognition on general traffic.
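The mapping step described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the phoneme symbols, the `EN_TO_FR` table, and the function names `map_pronunciation` and `build_contextual_lexicon` are hypothetical and do not come from the paper, which learns its mapping automatically from TTS audio rather than writing it by hand.

```python
from typing import Dict, List, Tuple

# Hypothetical English-to-French phoneme mapping. The symbols and the
# choices below are illustrative assumptions, not the learned mapping
# from the paper.
EN_TO_FR: Dict[str, Tuple[str, ...]] = {
    "k": ("k",),
    "w": ("w",),
    "i": ("i",),
    "n": ("n",),
    "T": ("s",),       # assumed: a source phoneme missing from the target inventory
    "dZ": ("d", "Z"),  # assumed: one source phoneme may expand to several
}


def map_pronunciation(
    source_phonemes: List[str], mapping: Dict[str, Tuple[str, ...]]
) -> List[str]:
    """Rewrite a foreign-lexicon pronunciation in the target phoneme inventory."""
    mapped: List[str] = []
    for phoneme in source_phonemes:
        if phoneme not in mapping:
            raise KeyError(f"no mapping for source phoneme {phoneme!r}")
        mapped.extend(mapping[phoneme])
    return mapped


def build_contextual_lexicon(
    entities: Dict[str, List[str]], mapping: Dict[str, Tuple[str, ...]]
) -> Dict[str, List[str]]:
    """Map every foreign entity's pronunciation so it could be offered to a
    target-language recognizer as a contextual entry."""
    return {word: map_pronunciation(pron, mapping) for word, pron in entities.items()}


if __name__ == "__main__":
    foreign_entities = {"queen": ["k", "w", "i", "n"], "thin": ["T", "i", "n"]}
    print(build_contextual_lexicon(foreign_entities, EN_TO_FR))
```

A table lookup is the simplest possible form of such a mapping; how the mapped pronunciations are injected into the decoder search space is not covered by this sketch.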
Pages: 5924-5928
Page count: 5