Hybrid language models for out of vocabulary word detection in large vocabulary conversational speech recognition

Cited by: 0
Authors
Yazgan, A [1]
Saraclar, M [1]
Affiliation
[1] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
In this paper, we propose a method for out-of-vocabulary (OOV) word detection, taking a step toward open-vocabulary automatic speech recognition. The proposed method uses a hybrid language model that combines words and subword units such as phones or syllables. We describe a detection algorithm based on the posterior count of the OOV words given the hybrid model, and compare it with detection based on the posterior probability of the best word string given a conventional word-only model. Experimental results on the Switchboard corpus are presented for different vocabulary sizes. The new method yields a gain of over 10% in OOV word detection. In addition, a modest number of the OOV word pronunciations are found correctly.
Pages: 745-748
Number of pages: 4