SPEECH RECOGNITION OF FOREIGN OUT-OF-VOCABULARY WORDS USING A HIERARCHICAL LANGUAGE MODEL

Authors
Yamamoto, Hirofumi [1]
Kikui, Genichiro [2]
Nakamura, Satoshi [1, 2]
Sagisaka, Yoshinori [1, 3]
Affiliations
[1] Natl Inst Informat & Commun Technol, 2-2-2 Hikaridai, Seika, Kyoto, Japan
[2] ATR Spoken Language Commun Res Labs, Kyoto, Japan
[3] Waseda Univ, GITI, Tokyo, Japan
Keywords
Speech recognition; Language model; Foreign word; Out-of-vocabulary word; Hierarchical language model
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a new speech recognition scheme for foreign out-of-vocabulary (OOV) words embedded in native-language speech. To recognize foreign names, which are frequently observed in news speech and in translation speech, we adopt a hierarchical language model that has been applied successfully to OOV words in native vocabularies. In this hierarchical language model, OOV words are modeled as a word class in the upper layer, and their statistical phonotactic constraints are modeled in the lower layer. Because extra statistics are needed to cover foreign words and their pronunciation differences, we introduce two techniques. The first combines translation target language models with translation source statistics of OOVs using the hierarchical language model. The second automatically generates recognition-target pronunciations from the original pronunciations by syllable-to-syllable mapping. To confirm the validity of this scheme, we conducted speech recognition experiments on English speech that includes Japanese personal names as OOV words. The proposed method outperformed the existing algorithm, which uses a lexicon consisting of all the words in the training set. Surprisingly, it also achieved better OOV recognition results than the non-OOV condition, in which all the proper names in the test set are registered in the lexicon.
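The two-layer scoring described in the abstract can be illustrated with a minimal sketch. It assumes a bigram upper layer in which all OOV names share a single class token, and a syllable-bigram lower layer for the phonotactics of that class. The token `<OOV_NAME>`, the function `oov_log_prob`, and every probability value are hypothetical and chosen only for illustration; they are not taken from the paper, whose actual model orders and estimation procedure are not given in this record.

```python
import math

# Hypothetical toy probabilities, for illustration only (not from the paper).
# Upper layer: word-level bigram where every OOV name maps to one class token.
UPPER_BIGRAM = {
    ("mr", "<OOV_NAME>"): 0.2,
    ("mr", "smith"): 0.05,
}

# Lower layer: syllable bigram capturing phonotactic constraints of the class.
LOWER_BIGRAM = {
    ("<s>", "ta"): 0.3,
    ("ta", "na"): 0.4,
    ("na", "ka"): 0.5,
    ("ka", "</s>"): 0.6,
}


def oov_log_prob(prev_word, syllables):
    """Return log P(class | prev_word) plus the syllable-bigram log-probability."""
    # Upper layer: probability that an OOV-class word follows prev_word.
    log_p = math.log(UPPER_BIGRAM[(prev_word, "<OOV_NAME>")])
    # Lower layer: probability of the syllable sequence inside the class.
    seq = ["<s>"] + list(syllables) + ["</s>"]
    for prev, cur in zip(seq, seq[1:]):
        log_p += math.log(LOWER_BIGRAM[(prev, cur)])
    return log_p


score = oov_log_prob("mr", ["ta", "na", "ka"])
```

Under these assumptions the score is the product of one upper-layer term and four lower-layer terms, so an unseen name is penalized only by how implausible its syllable sequence is, not by its absence from the lexicon.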
Pages: 1870 / +
Page count: 2