SPEECH RECOGNITION OF FOREIGN OUT-OF-VOCABULARY WORDS USING A HIERARCHICAL LANGUAGE MODEL

Cited by: 0
Authors
Yamamoto, Hirofumi [1]
Kikui, Genichiro [2]
Nakamura, Satoshi [1,2]
Sagisaka, Yoshinori [1,3]
Affiliations
[1] Natl Inst Informat & Commun Technol, 2-2-2 Hikaridai, Seika, Kyoto, Japan
[2] ATR Spoken Language Commun Res Labs, Kyoto, Japan
[3] Waseda Univ, GITI, Tokyo, Japan
Keywords
Speech recognition; Language model; Foreign word; Out-of-vocabulary word; Hierarchical language model
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a new speech recognition scheme for foreign out-of-vocabulary (OOV) words embedded in native-language speech. To recognize foreign names, which occur frequently in news speech and in translation speech, we adopted a hierarchical language model that had previously been applied successfully to native-language OOV words. In this hierarchical language model, OOV words are modeled as a word class in the upper layer, and their statistical phonotactic constraints are modeled in the lower layer. Since extra statistics are needed to cover foreign words and their pronunciation differences, we have introduced two techniques. The first is to combine translation target language models with translation source statistics of OOV words using the hierarchical language model. The second is to automatically generate recognition-target pronunciations from the original pronunciations by syllable-to-syllable mapping. To confirm the validity of this recognition scheme, we conducted speech recognition experiments using English speech that includes Japanese personal names as OOV words. The proposed method outperformed the existing algorithm, which uses a lexicon consisting of all the words in the training set. Surprisingly, it achieved better OOV recognition results than the non-OOV condition, in which all the proper names in the test set are registered in the lexicon.
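The two-layer idea in the abstract can be illustrated with a minimal sketch. It assumes that the upper layer is a word bigram containing an `<OOV>` class token and that the lower layer is a syllable bigram over the units inside an OOV word; the abstract does not state the model orders or units actually used. All probabilities, the lexicon, and the function names below are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical toy probabilities; none of these values come from the paper.
# Upper layer: P(token | previous token), where <OOV> stands for the OOV word class.
UPPER_BIGRAM = {
    ("mr", "<OOV>"): 0.4,
    ("mr", "smith"): 0.1,
}
# Lower layer: P(syllable | previous syllable) inside an OOV word.
LOWER_BIGRAM = {
    ("<s>", "ya"): 0.2,
    ("ya", "ma"): 0.5,
    ("ma", "da"): 0.25,
    ("da", "</s>"): 0.4,
}
LEXICON = {"mr", "smith"}  # assumed in-vocabulary words


def oov_log_prob(syllables):
    """Lower layer: log-probability of a syllable sequence given the OOV class."""
    seq = ["<s>", *syllables, "</s>"]
    return sum(math.log(LOWER_BIGRAM[(a, b)]) for a, b in zip(seq, seq[1:]))


def word_log_prob(prev, word, syllables=None):
    """Upper layer score; OOV words add the lower-layer score of their syllables."""
    if word in LEXICON:
        return math.log(UPPER_BIGRAM[(prev, word)])
    return math.log(UPPER_BIGRAM[(prev, "<OOV>")]) + oov_log_prob(syllables)


# An OOV name after "mr" is scored as P(<OOV> | mr) * P(ya ma da | <OOV>).
score = word_log_prob("mr", "yamada", ["ya", "ma", "da"])
```

In this sketch an OOV word never needs its own lexicon entry: the upper layer only predicts that some OOV word occurs, and the lower layer decides how plausible its syllable sequence is.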
Pages: 1870 / +
Page count: 2