POWER LAW DISCOUNTING FOR N-GRAM LANGUAGE MODELS

Cited by: 2
Authors
Huang, Songfang [1 ]
Renals, Steve [1 ]
Affiliations
[1] Univ Edinburgh, Ctr Speech Technol Res, Edinburgh EH8 9AB, Midlothian, Scotland
Keywords
language model; smoothing; absolute discount; Kneser-Ney; Bayesian; Pitman-Yor; power law;
DOI
10.1109/ICASSP.2010.5495007
CLC number (Chinese Library Classification)
O42 [Acoustics];
Discipline codes
070206; 082403
Abstract
We present an approximation to the Bayesian hierarchical Pitman-Yor process language model which maintains the power law distribution over word tokens, while not requiring a computationally expensive approximate inference process. This approximation, which we term power law discounting, has a similar computational complexity to interpolated and modified Kneser-Ney smoothing. We performed experiments on meeting transcription using the NIST RT06s evaluation data and the AMI corpus, with a vocabulary of 50,000 words and a language model training set of up to 211 million words. Our results indicate that power law discounting yields statistically significant reductions in perplexity and word error rate compared to both interpolated and modified Kneser-Ney smoothing, while producing results similar to the hierarchical Pitman-Yor process language model.
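As a minimal illustration of the idea summarized above, the Python sketch below implements a bigram model with a power-law discount. Two modelling choices are our assumptions rather than claims taken from the paper: the Pitman-Yor table count t(h, w) is approximated by c(h, w)^d, so Kneser-Ney's constant discount becomes the count-dependent discount d * c(h, w)^d, and the concentration parameter is set to zero, with back-off to a plain unigram distribution instead of a recursive lower-order model. All names (e.g. power_law_discounted_bigram) are ours.

from collections import Counter, defaultdict

def power_law_discounted_bigram(tokens, d=0.8):
    """Return prob(h, w) ~= P(w | h) under power-law discounting.

    d in (0, 1) plays the role of the Pitman-Yor discount; the table
    count t(h, w) is approximated by c(h, w) ** d (our assumption).
    """
    bigrams = Counter(zip(tokens, tokens[1:]))
    unigrams = Counter(tokens)
    n_tokens = sum(unigrams.values())
    p_uni = {w: c / n_tokens for w, c in unigrams.items()}

    context_total = defaultdict(int)    # c(h): total count of context h
    table_mass = defaultdict(float)     # T(h) = sum_w c(h, w) ** d
    for (h, w), c in bigrams.items():
        context_total[h] += c
        table_mass[h] += c ** d

    def prob(h, w):
        c_h = context_total[h]
        if c_h == 0:                    # unseen context: back off entirely
            return p_uni.get(w, 0.0)
        c_hw = bigrams[(h, w)]
        # power-law discount d * c_hw**d replaces Kneser-Ney's constant D
        discounted = max(c_hw - d * c_hw ** d, 0.0) / c_h
        backoff_weight = d * table_mass[h] / c_h
        return discounted + backoff_weight * p_uni.get(w, 0.0)

    return prob

# Example: probabilities for one context sum to 1 over the vocabulary.
tokens = "a rose is a rose is a rose".split()
prob = power_law_discounted_bigram(tokens, d=0.8)
print(sum(prob("a", w) for w in set(tokens)))  # -> 1.0 (up to rounding)

Because the discount d * c^d grows sublinearly in c, frequent n-grams keep a larger fraction of their probability mass than rare ones, which is how such a scheme can mimic the power-law behaviour of the hierarchical Pitman-Yor model at roughly the cost of Kneser-Ney smoothing.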
Pages: 5178-5181
Page count: 4