Provable Algorithms for Inference in Topic Models

Cited by: 0
Authors
Arora, Sanjeev [1]
Ge, Rong [2]
Koehler, Frederic [3]
Ma, Tengyu [1]
Moitra, Ankur [4,5]
Affiliations
[1] Princeton Univ, Dept Comp Sci, Princeton, NJ 08544 USA
[2] Duke Univ, Comp Sci Dept, Durham, NC 27706 USA
[3] Princeton Univ, Dept Math, Princeton, NJ 08544 USA
[4] MIT, Dept Math, Cambridge, MA 02139 USA
[5] MIT, CSAIL, Cambridge, MA 02139 USA
Keywords
LASSO
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Recently, there has been considerable progress on designing algorithms with provable guarantees, typically using linear algebraic methods, for parameter learning in latent variable models. But designing provable algorithms for inference has proven to be more challenging. Here we take a first step towards provable inference in topic models. We leverage a property of topic models that enables us to construct simple linear estimators for the unknown topic proportions; these estimators have small variance and consequently can work with short documents. Our estimators also correspond to finding an estimate around which the posterior is well concentrated. We show lower bounds establishing that for shorter documents it can be information-theoretically impossible to find the hidden topics. Finally, we give empirical results demonstrating that our algorithm works on realistic topic models: it yields good solutions on synthetic data and runs in time comparable to a single iteration of Gibbs sampling.
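The abstract's central technical idea is a simple linear estimator for a document's unknown topic proportions. The sketch below illustrates that general idea under stated assumptions and is not the paper's construction: it assumes the topic-word matrix A (V words by K topics, columns summing to 1) is already known, uses a Moore-Penrose pseudo-inverse as a stand-in for the paper's carefully chosen linear map, and then clips and renormalizes the result onto the simplex. The function name and the synthetic example are hypothetical.

import numpy as np

def estimate_topic_proportions(A, word_counts):
    # Linear estimate of the topic-proportion vector z from a bag-of-words count vector.
    # A: (V, K) topic-word matrix with columns summing to 1 (assumed known).
    w_hat = word_counts / word_counts.sum()   # empirical word frequencies
    B = np.linalg.pinv(A)                     # a left inverse of A (stand-in choice)
    z_raw = B @ w_hat                         # linear estimate: z_hat = B w_hat
    z = np.clip(z_raw, 0.0, None)             # suppress small negative noise
    return z / z.sum() if z.sum() > 0 else z  # renormalize onto the simplex

# Tiny synthetic check: 2 topics over a 4-word vocabulary, one 50-token document.
A = np.array([[0.4, 0.1],
              [0.4, 0.1],
              [0.1, 0.4],
              [0.1, 0.4]])
z_true = np.array([0.7, 0.3])
doc = np.random.default_rng(0).multinomial(50, A @ z_true)
print(estimate_topic_proportions(A, doc))     # should be close to z_true

The feature being illustrated is linearity in the empirical word frequencies w_hat, which is what the abstract credits with yielding small variance and hence usefulness on short documents.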
Pages: 9
Related Papers
50 entries in total
  • [11] Efficient Distributed Topic Modeling with Provable Guarantees
    Ding, Weicong
    Rohban, Mohammad H.
    Ishwar, Prakash
    Saligrama, Venkatesh
    ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 33, 2014, 33 : 167 - 175
  • [12] Protein Design by Provable Algorithms
    Hallen, Mark A.
    Donald, Bruce R.
    COMMUNICATIONS OF THE ACM, 2019, 62 (10) : 76 - 84
  • [13] Prior-aware Composition Inference for Spectral Topic Models
    Lee, Moontae
    Bindel, David
    Mimno, David
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 4258 - 4267
  • [14] On some provably correct cases of variational inference for topic models
    Awasthi, Pranjal
    Risteski, Andrej
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015), 2015, 28
  • [15] Scalable Collapsed Inference for High-Dimensional Topic Models
    Islam, Rashidul
    Foulds, James
    2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 2836 - 2845
  • [16] Sparse Partially Collapsed MCMC for Parallel Inference in Topic Models
    Magnusson, Mans
    Jonsson, Leif
    Villani, Mattias
    Broman, David
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2018, 27 (02) : 449 - 463
  • [18] A study on the application of topic models to motif finding algorithms
    Gutierrez, Josep Basha
    Nakai, Kenta
    BMC BIOINFORMATICS, 2016, 17
  • [19] Distributed algorithms for topic models
    Newman, David
    Asuncion, Arthur
    Smyth, Padhraic
    Welling, Max
    JOURNAL OF MACHINE LEARNING RESEARCH, 2009, 10 : 1801 - 1828
  • [20] Scalable inference of topic evolution via models for latent geometric structures
    Yurochkin, Mikhail
    Fan, Zhiwei
    Guha, Aritra
    Koutris, Paraschos
    Nguyen, XuanLong
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32