A category based approach for recognition of out-of-vocabulary words

Cited by: 0
Authors
Gallwitz, F
Noth, E
Niemann, H
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
In almost all applications of automatic speech recognition, especially in spontaneous speech tasks, the recognizer vocabulary cannot cover all occurring words. There is always a significant number of out-of-vocabulary words, even when the vocabulary is very large. In this paper we present a new approach for integrating out-of-vocabulary words into statistical language models. We use category information for all words in the training corpus to define a function that approximates the out-of-vocabulary word emission probability for each word category. This information is integrated into the language models. Although we use a simple acoustic model for out-of-vocabulary words, we achieve a 6% reduction in word error rate on spontaneous speech data with an out-of-vocabulary rate of about 5%.
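The abstract does not state the form of the per-category function, so the following is only a minimal sketch under a stated assumption: the out-of-vocabulary emission probability of a category is approximated by the share of that category's training tokens that fall outside the recognizer vocabulary. The function name `oov_prob_by_category`, the category labels, and the toy corpus are hypothetical and are not taken from the paper.

```python
from collections import Counter


def oov_prob_by_category(tagged_tokens, vocabulary):
    """Approximate P(OOV | category) by relative frequency.

    ASSUMPTION: this estimator is illustrative only; the paper's actual
    function is not given in the abstract.
    """
    totals = Counter()  # tokens seen per category
    oov = Counter()     # tokens per category that are not in the vocabulary
    for word, category in tagged_tokens:
        totals[category] += 1
        if word not in vocabulary:
            oov[category] += 1
    return {category: oov[category] / totals[category] for category in totals}


# Hypothetical category-tagged training corpus and recognizer vocabulary.
corpus = [
    ("i", "PRON"), ("fly", "VERB"), ("to", "PREP"), ("hamburg", "CITY"),
    ("from", "PREP"), ("erlangen", "CITY"), ("to", "PREP"),
    ("bonn", "CITY"), ("munich", "CITY"),
]
vocab = {"i", "fly", "to", "from", "hamburg", "munich"}

probs = oov_prob_by_category(corpus, vocab)
for category in sorted(probs):
    print(category, probs[category])
```

In this toy corpus, open categories such as city names receive a high out-of-vocabulary probability while closed categories such as prepositions receive none, which is the kind of per-category contrast a language model could exploit.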
Pages: 228-231
Page count: 4