Dynamic out-of-vocabulary word registration to language model for speech recognition

Cited by: 3
Authors
Kitaoka, Norihide [1]
Chen, Bohan [2]
Obashi, Yuya [3]
Affiliations
[1] Toyohashi Univ Technol, 1-1 Hibarigaoka Tempaku Cho, Toyohashi, Aichi, Japan
[2] Nagoya Univ, Chikusa Ku, 1 Furo Cho, Nagoya, Aichi, Japan
[3] Tokushima Univ, 2-1 Minamijohsanjima Cho, Tokushima, Japan
Keywords
Speech recognition; Out-of-vocabulary words; OOV registration; Language model
DOI
10.1186/s13636-020-00193-1
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
We propose a method of dynamically registering out-of-vocabulary (OOV) words by assigning the pronunciations of these words to pre-inserted OOV tokens, editing the pronunciations of the tokens. To do this, we add OOV tokens to an additional, partial copy of our corpus, either randomly or to part-of-speech (POS) tags in the selected utterances, when training the language model (LM) for speech recognition. This results in an LM containing OOV tokens, to which we can assign pronunciations. We also investigate the impact of acoustic complexity and the "natural" occurrence frequency of OOV words on the recognition of registered OOV words. The proposed OOV word registration method is evaluated using two modern automatic speech recognition (ASR) systems, Julius and Kaldi, using DNN-HMM acoustic models and N-gram language models (plus an additional evaluation using RNN re-scoring with Kaldi). Our experimental results show that when using the proposed OOV registration method, modern ASR systems can recognize OOV words without re-training the language model, that the acoustic complexity of OOV words affects OOV recognition, and that differences between the "natural" and the assigned occurrence frequencies of OOV words have little impact on the final recognition results.
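The two steps described in the abstract (adding OOV tokens to a partial copy of the corpus, then assigning pronunciations to those tokens) might be sketched as follows. This is only an illustrative sketch and not the authors' implementation: the function names, the token string `<OOV1>`, the `copy_fraction` parameter, and the choice to replace one randomly chosen word per selected utterance are all assumptions made for the example. The POS-based variant and the actual LM training with Julius or Kaldi are not shown.

```python
import random


def augment_with_oov_tokens(utterances, oov_tokens, copy_fraction=0.5, seed=0):
    """Return the original corpus plus a partial copy containing OOV tokens.

    Assumption for illustration: in each selected utterance, one randomly
    chosen word is replaced by a randomly chosen OOV token.
    """
    rng = random.Random(seed)
    n_copy = int(len(utterances) * copy_fraction)
    selected = rng.sample(utterances, n_copy)

    augmented = []
    for words in selected:
        if not words:
            continue  # nothing to replace in an empty utterance
        words = list(words)  # copy so the original utterance is untouched
        position = rng.randrange(len(words))
        words[position] = rng.choice(oov_tokens)
        augmented.append(words)

    # The LM would then be trained on the original data plus the partial copy.
    return list(utterances) + augmented


def register_oov_word(lexicon, token, pronunciation):
    """Assign a pronunciation (a list of phones) to a pre-inserted OOV token.

    Only the lexicon entry changes; the language model is not re-trained.
    """
    if token not in lexicon:
        raise KeyError(f"{token} was not pre-inserted into the lexicon")
    lexicon[token] = list(pronunciation)


# Hypothetical toy data.
corpus = [["i", "like", "tea"], ["she", "reads", "books"],
          ["we", "go", "home"], ["he", "runs"]]
training_data = augment_with_oov_tokens(corpus, ["<OOV1>"])

lexicon = {"<OOV1>": []}  # placeholder entry with no pronunciation yet
register_oov_word(lexicon, "<OOV1>", ["k", "a", "l", "d", "i"])
```

In this sketch a new word is registered at run time purely by editing the pronunciation stored for the placeholder token, which mirrors the abstract's claim that recognition of OOV words is possible without re-training the language model.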
Pages: 8
Related papers
50 in total
  • [1] Dynamic out-of-vocabulary word registration to language model for speech recognition
    Norihide Kitaoka
    Bohan Chen
    Yuya Obashi
    [J]. EURASIP Journal on Audio, Speech, and Music Processing, 2021
  • [2] Out-of-vocabulary word recognition with a hierarchical doubly Markov language model
    Kokubo, H
    Yamamoto, H
    Ogawa, Y
    Sagisaka, Y
    Kikui, G
    [J]. ASRU'03: 2003 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING ASRU '03, 2003, : 543 - 547
  • [3] SPEECH RECOGNITION OF FOREIGN OUT-OF-VOCABULARY WORDS USING A HIERARCHICAL LANGUAGE MODEL
    Yamamoto, Hirofumi
    Kikui, Genichiro
    Nakamura, Satoshi
    Sagisaka, Yoshinori
    [J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1870 - +
  • [4] Out-of-vocabulary word recognition using a hierarchical language model based on multiple Markov models
    Yamamoto, H
    Kokubo, H
    Kikui, G
    Ogawa, Y
    Sagisaka, Y
    [J]. ELECTRONICS AND COMMUNICATIONS IN JAPAN PART II-ELECTRONICS, 2005, 88 (12): : 55 - 64
  • [5] Out-of-vocabulary word rejection algorithm in Korean variable vocabulary word recognition
    Moon, KS
    Kim, YJ
    Kim, HR
    Chung, JH
    [J]. ISCAS 2000: IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS - PROCEEDINGS, VOL V: EMERGING TECHNOLOGIES FOR THE 21ST CENTURY, 2000, : 53 - 56
  • [6] Using the Web to create dynamic dictionaries in handwritten out-of-vocabulary word recognition
    Oprean, Cristina
    Likforman-Sulem, Laurence
    Popescu, Adrian
    Mokbel, Chafic
    [J]. 2013 12TH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), 2013, : 989 - 993
  • [7] OUT-OF-VOCABULARY WORD DETECTION IN A SPEECH-TO-SPEECH TRANSLATION SYSTEM
    Kuo, Hong-Kwang
    Kislal, Ellen Eide
    Mangu, Lidia
    Soltau, Hagen
    Beran, Tomas
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [8] An improved two-stage mixed language model approach for handling out-of-vocabulary words in large vocabulary continuous speech recognition
    Reveil, Bert
    Demuynck, Kris
    Martens, Jean-Pierre
    [J]. COMPUTER SPEECH AND LANGUAGE, 2014, 28 (01): : 141 - 162
  • [9] Out-of-Vocabulary Word Detection and Beyond
    Kombrink, Stefan
    Hannemann, Mirko
    Burget, Lukas
    [J]. DETECTION AND IDENTIFICATION OF RARE AUDIOVISUAL CUES, 2012, 384 : 57 - 65
  • [10] RNN Language Model Estimation for Out-of-Vocabulary Words
    Illina, Irina
    Fohr, Dominique
    [J]. HUMAN LANGUAGE TECHNOLOGY. CHALLENGES FOR COMPUTER SCIENCE AND LINGUISTICS, LTC 2017, 2020, 12598 : 199 - 211