Model Selection for Topic Models via Spectral Decomposition

Cited by: 0
Authors
Cheng, Dehua [1]
He, Xinran [1]
Liu, Yan [1]
Affiliations
[1] Univ Southern Calif, Los Angeles, CA 90007 USA
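Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code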
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Topic models have achieved significant success in analyzing large-scale text corpora. In practical applications, we are always confronted with the challenge of model selection, i.e., how to appropriately set the number of topics. Following recent advances in topic models via tensor decomposition, we make a first attempt to provide a theoretical analysis of model selection in latent Dirichlet allocation. Under mild conditions, we derive upper and lower bounds on the number of topics given a text collection of finite size. Experimental results demonstrate that our bounds are correct and tight. Furthermore, using the Gaussian mixture model as an example, we show that our methodology can be easily generalized to model selection analysis in other latent models.
Pages: 183 - 191
Number of pages: 9
Related papers
50 in total
  • [1] Prediction Focused Topic Models via Feature Selection
    Ren, Jason
    Kunes, Russell
    Doshi-Velez, Finale
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 4420 - 4428
  • [2] Optimal designs for Gaussian process models via spectral decomposition
    Harari, Ofir
    Steinberg, David M.
    JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 2014, 154 : 87 - 101
  • [3] Spectral Learning for Supervised Topic Models
    Ren, Yong
    Wang, Yining
    Zhu, Jun
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2018, 40 (03) : 726 - 739
  • [4] Spectral Methods for Supervised Topic Models
    Wang, Yining
    Zhu, Jun
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 27 (NIPS 2014), 2014, 27
  • [5] Spectral Methods for Correlated Topic Models
    Arabshahi, Forough
    Anandkumar, Animashree
    ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 54, 2017, 54 : 1439 - 1447
  • [6] Additive Regularization of Topic Models for Topic Selection and Sparse Factorization
    Vorontsov, Konstantin
    Potapenko, Anna
    Plavin, Alexander
    STATISTICAL LEARNING AND DATA SCIENCES, 2015, 9047 : 193 - 202
  • [7] A Topic Model Based on Poisson Decomposition
    Jiang, Haixin
    Zhou, Rui
    Zhang, Limeng
    Wang, Hua
    Zhang, Yanchun
    CIKM'17: PROCEEDINGS OF THE 2017 ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, 2017 : 1489 - 1498
  • [8] Generating homograph models in topic modeling for expediting user's model selection
    Banswal, Diksha
    Nagori, Meghana
    Kshirsagar, Vivek
    2018 9TH INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION AND NETWORKING TECHNOLOGIES (ICCCNT), 2018
  • [9] Probabilistic Word Selection via Topic Modeling
    Zhuang, Yueting
    Gao, Haidong
    Wu, Fei
    Tang, Siliang
    Zhang, Yin
    Zhang, Zhongfei
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2015, 27 (06) : 1643 - 1655
  • [10] SpectralLeader: Online Spectral Learning for Single Topic Models
    Yu, Tong
    Kveton, Branislav
    Wen, Zheng
    Bui, Hung
    Mengshoel, Ole J.
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2018, PT II, 2019, 11052 : 379 - 395