Joint Unsupervised Adaptation of N-gram and RNN Language Models via LDA-based Hybrid Mixture Modeling

Cited by: 0

Authors:

Masumura, Ryo [1]

Asami, Taichi [1]

Masataki, Hirokazu [1]

Aono, Yushi [1]

Affiliations:

[1] NTT Corp, NTT Media Intelligence Labs, Tokyo, Japan

Source:

2017 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC 2017) | 2017

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP301 [Theory and methods];

Discipline classification code:

081202;

Abstract:

This paper reports an initial study of unsupervised adaptation that assumes the simultaneous use of both n-gram and recurrent neural network (RNN) language models (LMs) in automatic speech recognition (ASR). Combining n-gram and RNN LMs is known to be more effective for ASR than using either alone. However, while various unsupervised adaptation methods specific to either n-gram LMs or RNN LMs have been examined, methods that adapt both simultaneously have not been presented. To handle different LMs in a unified unsupervised adaptation framework, our key idea is to introduce mixture modeling for both n-gram LMs and RNN LMs. Mixture modeling can handle multiple LMs simultaneously, and unsupervised adaptation can be accomplished simply by adjusting the mixture weights using a recognition hypothesis of the input speech. This paper proposes joint unsupervised adaptation achieved by hybrid mixture modeling that uses both n-gram mixture models and RNN mixture models. We present latent Dirichlet allocation (LDA) based hybrid mixture modeling for effective topic adaptation. Our experiments on lecture ASR tasks show the effectiveness of joint unsupervised adaptation. We also report the performance obtained when only the n-gram LM or only the RNN LM is adapted.
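
The abstract's idea of adapting a mixture of LMs by adjusting only the mixture weights on a recognition hypothesis can be illustrated with a minimal sketch. It uses generic EM re-estimation of interpolation weights, which is an assumption made here for illustration and not the LDA-based procedure the paper proposes. The function name `adapt_mixture_weights` and the input layout `component_probs` are hypothetical; each row is assumed to hold the probabilities that the component LMs (n-gram or RNN) assign to one word of the hypothesis.

```python
def adapt_mixture_weights(component_probs, n_iters=20):
    """Re-estimate mixture weights by EM on a recognition hypothesis.

    component_probs[t][k] is assumed to be P_k(w_t | history): the probability
    that component LM k assigns to the t-th word of the hypothesis.
    All probabilities are assumed to be strictly positive.
    """
    n_words = len(component_probs)
    n_components = len(component_probs[0])
    # Start from uniform weights (an illustrative choice).
    weights = [1.0 / n_components] * n_components

    for _ in range(n_iters):
        expected_counts = [0.0] * n_components
        for probs in component_probs:
            # Mixture probability of this word under the current weights.
            mix = sum(w * p for w, p in zip(weights, probs))
            # E-step: posterior responsibility of each component for this word.
            for k in range(n_components):
                expected_counts[k] += weights[k] * probs[k] / mix
        # M-step: new weights are the average responsibilities.
        weights = [c / n_words for c in expected_counts]
    return weights


def mixture_prob(weights, probs):
    """Word probability under the weighted mixture of component LMs."""
    return sum(w * p for w, p in zip(weights, probs))
```

In this sketch the same weight update applies regardless of whether a component is an n-gram LM or an RNN LM, since only the per-word probabilities are needed; how the paper ties the two mixtures together through LDA is not stated in the abstract and is not modeled here.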


Pages: 1538-1541

Page count: 4
