Unsupervised Language Model Adaptation Using Latent Semantic Marginals

Cited by: 0
Authors
Tam, Yik-Cheung [1]
Schultz, Tanja [1]
Affiliations
[1] Carnegie Mellon Univ, InterACT, Pittsburgh, PA 15213 USA
Keywords
unsupervised LM adaptation; LSA marginals; Latent Dirichlet Allocation; Mandarin Broadcast News;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We integrated the Latent Dirichlet Allocation (LDA) approach, a latent semantic analysis model, into an unsupervised language model adaptation framework. We adapted a background language model by minimizing the Kullback-Leibler divergence between the adapted model and the background model, subject to the constraint that the marginalized unigram probability distribution of the adapted model equals the corresponding distribution estimated by the LDA model - the latent semantic marginals. We evaluated our approach on the RT04 Mandarin Broadcast News test set and experimented with different LM training settings. Results showed that our approach reduces both the perplexity and the character error rate under supervised and unsupervised adaptation.
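The constrained minimization described in the abstract is often approximated in the marginal-adaptation literature by unigram scaling: each background conditional p_bg(w|h) is multiplied by (p_target(w) / p_bg(w))^beta and renormalized per history. The sketch below illustrates that approximation on a toy bigram model. It is not taken from the paper: the function name `adapt_conditionals`, the exponent `beta`, and all probability values are assumptions made for illustration, and the target unigram is hand-written rather than estimated by LDA.

```python
def adapt_conditionals(p_bg_cond, p_bg_uni, p_target_uni, beta=0.5):
    """Unigram-scaling approximation to marginal-constrained adaptation.

    p_bg_cond:    {history: {word: p_bg(word | history)}} (background model)
    p_bg_uni:     {word: p_bg(word)}     (background unigram marginal)
    p_target_uni: {word: p_target(word)} (target marginal, e.g. from a topic model)
    beta:         tuning exponent; assumed here, not specified by the source
    """
    adapted = {}
    for history, dist in p_bg_cond.items():
        # Boost words whose target marginal exceeds the background marginal.
        scaled = {
            w: p * (p_target_uni[w] / p_bg_uni[w]) ** beta
            for w, p in dist.items()
        }
        # Renormalize so each conditional distribution sums to one.
        z = sum(scaled.values())
        adapted[history] = {w: s / z for w, s in scaled.items()}
    return adapted


# Toy, hand-picked distributions (hypothetical values for illustration only).
p_bg_uni = {"the": 0.5, "stock": 0.25, "game": 0.25}
p_target_uni = {"the": 0.5, "stock": 0.4, "game": 0.1}  # finance-like topic mix
p_bg_cond = {
    "the": {"the": 0.1, "stock": 0.45, "game": 0.45},
    "stock": {"the": 0.6, "stock": 0.1, "game": 0.3},
}

adapted = adapt_conditionals(p_bg_cond, p_bg_uni, p_target_uni, beta=0.5)
for history, dist in adapted.items():
    print(history, {w: round(p, 3) for w, p in dist.items()})
```

With these toy values, "stock" gains probability after every history and "game" loses it, which is the intended effect of shifting the model toward the target marginal. The scaling form only approximates the equality constraint on the marginals; it does not guarantee that the adapted model's unigram marginal matches the target exactly.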
Pages: 2206-2209
Page count: 4
Related papers
50 in total
  • [1] Unsupervised language model adaptation using LDA-based mixture models and latent semantic marginals
    Haidar, Md. Akmal
    O'Shaughnessy, Douglas
    [J]. COMPUTER SPEECH AND LANGUAGE, 2015, 29 (01): : 20 - 31
  • [2] UNSUPERVISED LANGUAGE MODEL ADAPTATION USING LATENT DIRICHLET ALLOCATION AND DYNAMIC MARGINALS
    Haidar, Md. Akmal
    O'Shaughnessy, Douglas
    [J]. 19TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO-2011), 2011, : 1480 - 1484
  • [3] Integrating MAP, Marginals, and Unsupervised Language Model Adaptation
    Wang, Wen
    Stolcke, Andreas
    [J]. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 2692 - 2695
  • [4] Correlated latent semantic model for unsupervised LM adaptation
    Tam, Yik-Cheung
    Schultz, Tanja
    [J]. 2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 41 - +
  • [5] Novel Weighting Scheme for Unsupervised Language Model Adaptation Using Latent Dirichlet Allocation
    Haidar, Md Akmal
    O'Shaughnessy, Douglas
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2438 - 2441
  • [6] Robust topic inference for latent semantic language model adaptation
    Heidel, Aaron
    Lee, Lin-shan
    [J]. 2007 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, VOLS 1 AND 2, 2007, : 177 - 182
  • [7] LDA-BASED LM ADAPTATION USING LATENT SEMANTIC MARGINALS AND MINIMUM DISCRIMINANT INFORMATION
    Haidar, Md Akmal
    O'Shaughnessy, Douglas
    [J]. 2012 PROCEEDINGS OF THE 20TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2012, : 2040 - 2044
  • [8] Unsupervised language model adaptation
    Bacchiani, M
    Roark, B
    [J]. 2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003, : 224 - 227
  • [9] Latent Space Regularization for Unsupervised Domain Adaptation in Semantic Segmentation
    Barbato, Francesco
    Toldo, Marco
    Michieli, Umberto
    Zanuttigh, Pietro
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021, : 2829 - 2839
  • [10] Language model adaptation in Tamil language using cross-lingual latent semantic analysis with document aligned corpora
    Selvam, M.
    Natarajan, A. M.
    [J]. CURRENT SCIENCE, 2010, 98 (07): : 922 - 929