A category based approach for recognition of out-of-vocabulary words

被引:0
|
作者
Gallwitz, F
Noth, E
Niemann, H
机构
关键词
D O I
暂无
中图分类号
O42 [声学];
学科分类号
070206 ; 082403 ;
摘要
In almost all applications of automatic speech recognition, especially in spontaneous speech tasks, the recognizer vocabulary cannot cover all occurring words. There is always a significant amount of out-of-vocabulary words even when the vocabulary size is very large. In this paper we present a new approach for the integration of out-of-vocabulary words into statistical language models. We use category information for all words in the training corpus to define a function that gives an approximation of the out-of-vocabulary word emission probability for each word category. This information is integrated into the language models. Although we use a simple acoustic model for out-of-vocabulary words, we achieve a 6% reduction of word error rate on spontaneous speech data with about 5% out-of-vocabulary rate.
引用
收藏
页码:228 / 231
页数:4
相关论文
共 50 条
  • [31] FastContext: Handling Out-of-Vocabulary Words Using the Word Structure and Context
    Silva, Renato M.
    Lochter, Johannes, V
    Almeida, Tiago A.
    Yamakami, Akebo
    INTELLIGENT SYSTEMS, PT II, 2022, 13654 : 539 - 557
  • [32] Bilingual Character Representation for Efficiently Addressing Out-of-Vocabulary Words in Code-Switching Named Entity Recognition
    Winata, Genta Indra
    Wu, Chien-Sheng
    Madotto, Andrea
    Fung, Pascale
    COMPUTATIONAL APPROACHES TO LINGUISTIC CODE-SWITCHING, 2018, : 110 - 114
  • [33] Exploring Edit Distance for Normalising Out-of-Vocabulary Malay Words on Social Media
    Athirah, Raja Roza
    Soon, Lay-Ki
    Haw, Su-Cheng
    ENGINEERING APPLICATION OF ARTIFICIAL INTELLIGENCE CONFERENCE 2018 (EAAIC 2018), 2019, 255
  • [34] Chinese Word Segmentation and Out-Of-Vocabulary Words Detection Using Suffix Array
    Ji Wenyan
    Peng Tao
    Zuo Wanli
    He Fengling
    Zhu Huifeng
    WISM: 2009 INTERNATIONAL CONFERENCE ON WEB INFORMATION SYSTEMS AND MINING, PROCEEDINGS, 2009, : 56 - 60
  • [35] A Spoken Term Detection Framework for Recovering Out-of-Vocabulary Words Using the Web
    Parada, Carolina
    Sethy, Abhinav
    Dredze, Mark
    Jelinek, Frederick
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 1269 - +
  • [36] Out-of-vocabulary rejection based on selective attention model
    Park, KY
    Lee, SY
    NEURAL PROCESSING LETTERS, 2000, 12 (01) : 41 - 48
  • [37] A phoneme-based approach for eliminating out-of-vocabulary problem of Turkish speech recognition using Hidden Markov Model
    Yavuz, Erdem
    Topuz, Vedat
    COMPUTER SYSTEMS SCIENCE AND ENGINEERING, 2018, 33 (06): : 429 - 445
  • [38] Out-of-Vocabulary Rejection based on Selective Attention Model
    Ki-Young Park
    Soo-Young Lee
    Neural Processing Letters, 2000, 12 : 41 - 48
  • [39] Out-of-vocabulary word recognition with a hierarchical doubly Markov language model
    Kokubo, H
    Yamamoto, H
    Ogawa, Y
    Sagisaka, Y
    Kikui, G
    ASRU'03: 2003 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING ASRU '03, 2003, : 543 - 547
  • [40] Dynamic out-of-vocabulary word registration to language model for speech recognition
    Kitaoka, Norihide
    Chen, Bohan
    Obashi, Yuya
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2021, 2021 (01)