Dynamic out-of-vocabulary word registration to language model for speech recognition

Cited by: 3

Authors:

Kitaoka, Norihide [1]

Chen, Bohan [2]

Obashi, Yuya [3]

Affiliations:

[1] Toyohashi Univ Technol, 1-1 Hibarigaoka Tempaku Cho, Toyohashi, Aichi, Japan

[2] Nagoya Univ, Chikusa Ku, 1 Furo Cho, Nagoya, Aichi, Japan

[3] Tokushima Univ, 2-1 Minamijohsanjima Cho, Tokushima, Japan

Source:

EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING | 2021 / Vol. 2021 / Issue 1

Keywords:

Speech recognition; Out-of-vocabulary words; OOV registration; Language model

DOI:

10.1186/s13636-020-00193-1

Chinese Library Classification:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

We propose a method of dynamically registering out-of-vocabulary (OOV) words by assigning the pronunciations of these words to pre-inserted OOV tokens, editing the pronunciations of the tokens. To do this, we add OOV tokens to an additional, partial copy of our corpus, either randomly or to part-of-speech (POS) tags in the selected utterances, when training the language model (LM) for speech recognition. This results in an LM containing OOV tokens, to which we can assign pronunciations. We also investigate the impact of acoustic complexity and the "natural" occurrence frequency of OOV words on the recognition of registered OOV words. The proposed OOV word registration method is evaluated using two modern automatic speech recognition (ASR) systems, Julius and Kaldi, using DNN-HMM acoustic models and N-gram language models (plus an additional evaluation using RNN re-scoring with Kaldi). Our experimental results show that when using the proposed OOV registration method, modern ASR systems can recognize OOV words without re-training the language model, that the acoustic complexity of OOV words affects OOV recognition, and that differences between the "natural" and the assigned occurrence frequencies of OOV words have little impact on the final recognition results.
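The two steps named in the abstract (seeding the LM training text with OOV tokens, then assigning a pronunciation to a token at registration time) can be illustrated with a minimal sketch. This is not the authors' implementation: the token name `<OOV1>`, the helper names `insert_oov_tokens` and `register_oov`, the choice to replace one randomly chosen word per selected utterance, and the dictionary-based lexicon are all assumptions made for illustration. The POS-based variant and the actual Julius and Kaldi lexicon formats are not shown.

```python
import random

# Hypothetical placeholder token; the abstract does not give the real name.
OOV_TOKEN = "<OOV1>"


def insert_oov_tokens(corpus, copy_fraction, rng):
    """Return the corpus plus a partial copy that contains OOV tokens.

    Assumption: in each copied utterance, one randomly chosen word is
    replaced by OOV_TOKEN (the "random" strategy; POS-based placement
    is not covered here).
    """
    n_copy = int(len(corpus) * copy_fraction)
    selected = rng.sample(corpus, n_copy)
    augmented = list(corpus)  # the original utterances are kept unchanged
    for utterance in selected:
        words = utterance.split()
        if not words:
            continue  # skip empty utterances
        words[rng.randrange(len(words))] = OOV_TOKEN
        augmented.append(" ".join(words))
    return augmented


def register_oov(lexicon, token, phonemes):
    """Assign a pronunciation to a pre-inserted token.

    Only the pronunciation lexicon changes; the language model trained
    on the augmented corpus is left as it is.
    """
    updated = dict(lexicon)
    updated[token] = list(phonemes)
    return updated


if __name__ == "__main__":
    rng = random.Random(0)
    corpus = ["we went to tokyo", "the train was late", "please call me later"]
    lm_training_text = insert_oov_tokens(corpus, copy_fraction=0.67, rng=rng)
    print(lm_training_text)

    # Toy phoneme strings, purely illustrative.
    lexicon = {"tokyo": ["t", "o", "k", "y", "o"], OOV_TOKEN: []}
    lexicon = register_oov(lexicon, OOV_TOKEN, ["k", "a", "l", "d", "i"])
    print(lexicon[OOV_TOKEN])
```

Under these assumptions the LM is trained once on the augmented text, so registering a new word later only touches the lexicon entry for the token.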

Pages: 8