Hierarchical Pitman-Yor and Dirichlet Process for Language Model

Authors
Chien, Jen-Tzung [1]
Chang, Ying-Lan [1]
Affiliations
[1] Natl Chiao Tung Univ, Dept Elect & Comp Engn, Hsinchu 30010, Taiwan
Keywords
language model; backoff model; topic model; Bayesian learning; INFORMATION
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a nonparametric interpretation of the modern language model based on the hierarchical Pitman-Yor and Dirichlet (HPYD) process. We propose the HPYD language model (HPYD-LM), which flexibly conducts backoff smoothing and topic clustering through Bayesian nonparametric learning. The nonparametric priors of backoff n-grams and latent topics are tightly coupled in a compound process. A hybrid probability measure is drawn to build the smoothed topic-based LM. The model structure is determined automatically from the training data. A new Chinese restaurant scenario is proposed to implement HPYD-LM via Gibbs sampling. This process reflects the power-law property and extracts semantic topics from natural language. Experiments on different corpora show that HPYD-LM outperforms related LMs in terms of perplexity and word error rate.
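The power-law property mentioned in the abstract can be illustrated with a generic, single-restaurant Pitman-Yor Chinese restaurant process. The sketch below is not the HPYD-LM sampler and not the authors' "new Chinese restaurant scenario"; it is a minimal illustration under stated assumptions. The function name `sample_crp_table_counts`, the discount and concentration values, and the customer count are all hypothetical choices made for the example. Setting the discount to 0 gives the ordinary Dirichlet process, where the number of tables grows roughly logarithmically, while a positive discount makes it grow roughly as a power of the number of customers.

```python
import random


def sample_crp_table_counts(num_customers, discount, concentration, seed=0):
    """Seat customers one by one in a Pitman-Yor Chinese restaurant.

    Hypothetical helper for illustration only. Returns the list of table
    sizes. discount=0 reduces to the Dirichlet process.
    """
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must lie in [0, 1)")
    if concentration <= -discount:
        raise ValueError("concentration must exceed -discount")

    rng = random.Random(seed)
    table_sizes = []
    for _ in range(num_customers):
        if not table_sizes:
            # The first customer always opens the first table.
            table_sizes.append(1)
            continue
        # Unnormalised seating weights:
        #   existing table k : size_k - discount
        #   new table        : concentration + discount * (number of tables)
        weights = [size - discount for size in table_sizes]
        weights.append(concentration + discount * len(table_sizes))
        choice = rng.choices(range(len(weights)), weights=weights)[0]
        if choice == len(table_sizes):
            table_sizes.append(1)
        else:
            table_sizes[choice] += 1
    return table_sizes


if __name__ == "__main__":
    # Assumed settings: 2000 customers, concentration 1.0.
    for d in (0.0, 0.8):
        sizes = sample_crp_table_counts(2000, discount=d, concentration=1.0, seed=1)
        print(f"discount={d}: {len(sizes)} tables, largest table has {max(sizes)} customers")
```

In a language-model reading, customers correspond to word tokens and tables to distinct draws from the base measure, so the many small tables produced by a positive discount mirror the long tail of rare words in natural language.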
Pages: 2211-2215
Page count: 5