Language Model Adaptation Using Latent Dirichlet Allocation and an Efficient Topic Inference Algorithm

Authors
Heidel, Aaron [1]
Chang, Hung-an
Lee, Lin-shan [1]
Affiliations
[1] Natl Taiwan Univ, Dept Comp Sci & Informat Engn, Taipei 10764, Taiwan
Keywords
language model; unsupervised adaptation; topic modeling; speech recognition
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present an effort to perform topic mixture-based language model adaptation using latent Dirichlet allocation (LDA). We use probabilistic latent semantic analysis (PLSA) to automatically cluster a heterogeneous training corpus, and train an LDA model using the resultant topic-document assignments. Using this LDA model, we then construct topic-specific corpora at the utterance level for interpolation with a background language model during language model adaptation. We also present a novel iterative algorithm for LDA topic inference. Very encouraging results were obtained in preliminary experiments with broadcast news in Mandarin Chinese.
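The abstract does not spell out the inference algorithm or the interpolation formula, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes fixed topic unigram distributions, estimates per-utterance topic weights with a standard EM-style fixed-point iteration (swapped in here as a stand-in for the paper's novel algorithm), and linearly interpolates the resulting topic mixture with a background unigram model. All names (`infer_topic_weights`, `adapt_unigram`, `lam`) and the toy numbers are hypothetical.

```python
import numpy as np


def infer_topic_weights(word_ids, topic_word, n_iter=50, tol=1e-8):
    """Estimate mixture weights over fixed topic unigram models.

    This is a generic EM fixed-point iteration, used as an assumption;
    it is not the iterative algorithm proposed in the paper.
    """
    n_topics = topic_word.shape[0]
    theta = np.full(n_topics, 1.0 / n_topics)   # start from uniform weights
    probs = topic_word[:, word_ids]             # shape: (n_topics, n_words)
    for _ in range(n_iter):
        joint = theta[:, None] * probs          # unnormalized responsibilities
        resp = joint / joint.sum(axis=0, keepdims=True)
        new_theta = resp.mean(axis=1)           # average responsibility per topic
        converged = np.abs(new_theta - theta).max() < tol
        theta = new_theta
        if converged:
            break
    return theta


def adapt_unigram(background, topic_word, theta, lam=0.5):
    """Linearly interpolate a background unigram with the topic mixture."""
    topic_mix = theta @ topic_word              # utterance-specific unigram
    return lam * background + (1.0 - lam) * topic_mix


# Toy setup (hypothetical): 2 topics over a 3-word vocabulary.
topic_word = np.array([[0.7, 0.2, 0.1],
                       [0.1, 0.2, 0.7]])
background = np.full(3, 1.0 / 3.0)
utterance = np.array([0, 0, 1, 0])              # word indices of one utterance

theta = infer_topic_weights(utterance, topic_word)
adapted = adapt_unigram(background, topic_word, theta)
```

In this toy case the utterance is dominated by word 0, so the inferred weights lean toward the first topic and the adapted model raises the probability of word 0 relative to the background. A real system would interpolate n-gram models rather than unigrams and would tune `lam` on held-out data.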
Pages: 1145+
Number of pages: 2