SPEECH RECOGNITION OF FOREIGN OUT-OF-VOCABULARY WORDS USING A HIERARCHICAL LANGUAGE MODEL

Authors
Yamamoto, Hirofumi [1]
Kikui, Genichiro [2]
Nakamura, Satoshi [1, 2]
Sagisaka, Yoshinori [1, 3]
Affiliations
[1] Natl Inst Informat & Commun Technol, 2-2-2 Hikaridai, Seika, Kyoto, Japan
[2] ATR Spoken Language Commun Res Labs, Kyoto, Japan
[3] Waseda Univ, GITI, Tokyo, Japan
Keywords
Speech recognition; Language model; Foreign word; Out-of-vocabulary word; Hierarchical language model
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a new speech recognition scheme for foreign out-of-vocabulary (OOV) words embedded in native-language speech. To recognize foreign names, which are frequently observed in news speech and in translation speech, we adopt a hierarchical language model that has been successfully applied to OOV words from native vocabularies. In this hierarchical language model, OOV words are modeled as a word class in the upper-layer model, and their statistical phonotactic constraints are modeled in the lower-layer model. Because extra statistics are needed to cover foreign words and their pronunciation differences, we introduce two techniques. The first combines translation target-language models with translation source-language statistics of OOV words through the hierarchical language model. The second automatically generates recognition-target pronunciations from the original pronunciations by syllable-to-syllable mapping. To confirm the validity of this scheme, we conducted speech recognition experiments on English speech that includes Japanese personal names as OOV words. The proposed method outperformed the existing algorithm, which uses a lexicon consisting of all the words in the training set. Surprisingly, it achieved better OOV recognition results than the non-OOV condition, in which all the proper names in the test set are registered in the lexicon.
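The two-layer idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class token `<NAME>`, the use of syllables as the lower-layer unit, the bigram order, every probability, and the mapping table are hypothetical values chosen only to make the example run. The upper layer scores the word sequence with OOV names collapsed into one class, the lower layer scores the syllable sequence inside each OOV name, and the two log-probabilities are added. A small lookup stands in for syllable-to-syllable pronunciation mapping.

```python
import math

# Hypothetical toy parameters; none of these numbers come from the paper.
# Upper layer: word bigram in which every OOV name is the class token "<NAME>".
UPPER_BIGRAM = {
    ("<s>", "mister"): 0.2,
    ("mister", "<NAME>"): 0.5,
    ("<NAME>", "</s>"): 0.4,
}

# Lower layer: syllable bigram inside the "<NAME>" class (assumed unit: syllable).
LOWER_BIGRAM = {
    ("<b>", "ta"): 0.1,
    ("ta", "na"): 0.2,
    ("na", "ka"): 0.3,
    ("ka", "<e>"): 0.25,
}

# Hypothetical mapping from source-language syllables to target-language phones.
SYLLABLE_MAP = {"ta": "t aa", "na": "n aa", "ka": "k aa"}


def chain_log_prob(units, table, floor=1e-6):
    """Sum bigram log-probabilities over adjacent pairs; unseen pairs get a floor."""
    return sum(math.log(table.get(pair, floor)) for pair in zip(units, units[1:]))


def hierarchical_log_prob(words, oov_syllables):
    """Score a sentence whose OOV names appear as "<NAME>" placeholders.

    words: word sequence containing "<NAME>" for each OOV name.
    oov_syllables: one syllable list per "<NAME>" placeholder, in order.
    """
    upper = chain_log_prob(["<s>"] + words + ["</s>"], UPPER_BIGRAM)
    lower = sum(
        chain_log_prob(["<b>"] + syllables + ["<e>"], LOWER_BIGRAM)
        for syllables in oov_syllables
    )
    return upper + lower


def map_pronunciation(syllables):
    """Map each source syllable to a target pronunciation; unknown ones pass through."""
    return [SYLLABLE_MAP.get(s, s) for s in syllables]


score = hierarchical_log_prob(["mister", "<NAME>"], [["ta", "na", "ka"]])
mapped = map_pronunciation(["ta", "na", "ka"])
```

Here `score` is the log of the product of the three upper-layer and four lower-layer bigram probabilities, so a name never seen as a whole word still receives a probability through its class and its syllable sequence.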
Pages: 1870 / +
Page count: 2