Model Selection for Topic Models via Spectral Decomposition

Cited by: 0
Authors
Cheng, Dehua [1]
He, Xinran [1]
Liu, Yan [1]
Affiliations
[1] Univ Southern Calif, Los Angeles, CA 90007 USA
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Topic models have achieved significant success in analyzing large-scale text corpora. In practical applications, we are always confronted with the challenge of model selection, i.e., how to appropriately set the number of topics. Following recent advances in topic models via tensor decomposition, we make a first attempt to provide a theoretical analysis of model selection in latent Dirichlet allocation. Under mild conditions, we derive upper and lower bounds on the number of topics given a text collection of finite size. Experimental results demonstrate that our bounds are correct and tight. Furthermore, using the Gaussian mixture model as an example, we show that our methodology can be easily generalized to model selection analysis in other latent models.
Pages: 183-191
Page count: 9