Provable Algorithms for Inference in Topic Models

Cited by: 0
Authors
Arora, Sanjeev [1]
Ge, Rong [2]
Koehler, Frederic [3]
Ma, Tengyu [1]
Moitra, Ankur [4, 5]
Affiliations
[1] Princeton Univ, Dept Comp Sci, Princeton, NJ 08544 USA
[2] Duke Univ, Comp Sci Dept, Durham, NC 27706 USA
[3] Princeton Univ, Dept Math, Princeton, NJ 08544 USA
[4] MIT, Dept Math, Cambridge, MA 02139 USA
[5] MIT, CSAIL, Cambridge, MA 02139 USA
Keywords
LASSO
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recently, there has been considerable progress on designing algorithms with provable guarantees, typically using linear-algebraic methods, for parameter learning in latent variable models. But designing provable algorithms for inference has proven more challenging. Here we take a first step towards provable inference in topic models. We leverage a property of topic models that enables us to construct simple linear estimators for the unknown topic proportions that have small variance and consequently work with short documents. Our estimators also correspond to finding an estimate around which the posterior is well concentrated. We show lower bounds establishing that, for shorter documents, it can be information-theoretically impossible to find the hidden topics. Finally, we give empirical results demonstrating that our algorithm works on realistic topic models: it yields good solutions on synthetic data and runs in time comparable to a single iteration of Gibbs sampling.
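To make the idea of a linear estimator for topic proportions concrete, here is a minimal sketch. It is not the authors' algorithm: the abstract does not specify how the estimator is built, so the sketch substitutes an ordinary least-squares left inverse of the topic-word matrix. The matrix A, the proportions x_true, and all function names are hypothetical and chosen only for illustration. The point is that if a document's expected word frequencies are A x, then any matrix B with B A = I gives an unbiased linear estimate B y of x from the observed frequencies y.

```python
import random

def solve(M, b):
    """Solve M z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]

def linear_estimate(A, freqs):
    """Least-squares estimate of topic proportions.

    A is a vocab_size x k list of rows (column j is topic j's word
    distribution); freqs is the document's empirical word frequencies.
    Equivalent to applying B = (A^T A)^{-1} A^T, an assumed stand-in
    for the estimator described in the abstract.
    """
    k = len(A[0])
    AtA = [[sum(row[i] * row[j] for row in A) for j in range(k)]
           for i in range(k)]
    Aty = [sum(row[i] * f for row, f in zip(A, freqs)) for i in range(k)]
    return solve(AtA, Aty)

def sample_frequencies(A, x, n_words, rng):
    """Draw n_words words from the mixture A x and return their frequencies."""
    probs = [sum(a * xi for a, xi in zip(row, x)) for row in A]
    counts = [0] * len(A)
    for w in rng.choices(range(len(A)), weights=probs, k=n_words):
        counts[w] += 1
    return [c / n_words for c in counts]

# Hypothetical 4-word, 2-topic model; each column sums to 1.
A = [
    [0.5, 0.0],
    [0.3, 0.1],
    [0.2, 0.3],
    [0.0, 0.6],
]
x_true = [0.7, 0.3]

rng = random.Random(0)
doc_freqs = sample_frequencies(A, x_true, n_words=200, rng=rng)
print("estimated proportions:", linear_estimate(A, doc_freqs))
```

With exact frequencies the estimate recovers x_true; with a sampled document it fluctuates around it, and the size of that fluctuation depends on the entries of the left inverse, which is why a low-variance choice matters for short documents.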
Pages: 9