Supervised labeled latent Dirichlet allocation for document categorization

Citations: 0
Authors
Ximing Li
Jihong Ouyang
Xiaotang Zhou
You Lu
Yanhui Liu
Affiliations
[1] Jilin University, College of Computer Science and Technology
[2] Jilin University, Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education
Source
Applied Intelligence | 2015 / Vol. 42
Keywords
Supervised; Topic modeling; Latent Dirichlet allocation; Multi-label classification
DOI
Not available
Abstract
Recently, supervised topic modeling approaches have received considerable attention. However, the representative labeled latent Dirichlet allocation (L-LDA) method has a tendency to over-focus on the pre-assigned labels, and does not give potentially lost labels and common semantics sufficient consideration. To overcome these problems, we propose an extension of L-LDA, namely supervised labeled latent Dirichlet allocation (SL-LDA), for document categorization. Our model makes two fundamental assumptions, i.e., Prior 1 and Prior 2, that relax the restriction of label sampling and extend the concept of topics. In this paper, we develop a Gibbs expectation-maximization algorithm to learn the SL-LDA model. Quantitative experimental results demonstrate that SL-LDA is competitive with state-of-the-art approaches on both single-label and multi-label corpora.
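To make the label restriction mentioned in the abstract concrete, here is a minimal sketch of the baseline L-LDA sampling step only, not of SL-LDA or its Prior 1 and Prior 2, which the abstract does not specify. It assumes the standard collapsed Gibbs form in which a word's topic is drawn only from the document's pre-assigned labels. The function names, the hyperparameter values, and the toy counts are all hypothetical and chosen for illustration.

```python
import random

def restricted_topic_weights(doc_labels, n_dk, n_kw, n_k, vocab_size,
                             alpha=0.1, beta=0.01):
    """Unnormalized collapsed-Gibbs weights for one word, L-LDA style.

    Counts are assumed to already exclude the word being resampled.
    Topics outside the document's label set keep weight 0; this is the
    restriction the abstract describes as over-focusing on the labels.
    """
    weights = [0.0] * len(n_k)
    for k in doc_labels:
        weights[k] = ((n_dk[k] + alpha) * (n_kw[k] + beta)
                      / (n_k[k] + vocab_size * beta))
    return weights

def sample_topic(weights, rng):
    # Draw one topic index in proportion to its weight.
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]

# Toy counts (made up): 4 label-topics, document tagged with labels 0 and 2.
n_dk = [3, 0, 1, 0]      # topic counts within this document
n_kw = [5, 9, 2, 7]      # counts of the current word under each topic
n_k = [50, 60, 40, 70]   # total words assigned to each topic
weights = restricted_topic_weights([0, 2], n_dk, n_kw, n_k, vocab_size=100)

rng = random.Random(0)
draws = [sample_topic(weights, rng) for _ in range(1000)]
```

In this toy setup topic 1 has the highest count for the current word, yet it can never be drawn because the document is not tagged with label 1. A missing label therefore stays missing under the baseline, which is the behavior the abstract says SL-LDA relaxes.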
Pages: 581-593
Page count: 12
Related papers
  • [1] Supervised labeled latent Dirichlet allocation for document categorization
    Li, Ximing
    Ouyang, Jihong
    Zhou, Xiaotang
    Lu, You
    Liu, Yanhui
    [J]. APPLIED INTELLIGENCE, 2015, 42 (03) : 581 - 593
  • [2] Latent Dirichlet Allocation for Automatic Document Categorization
    Biro, Istvan
    Szabo, Jacint
    [J]. MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PT II, 2009, 5782 : 430 - 441
  • [3] On the Effectiveness of Labeled Latent Dirichlet Allocation in Automatic Bug-Report Categorization
    Zibran, Minhaz F.
    [J]. 2016 IEEE/ACM 38TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING COMPANION (ICSE-C), 2016, : 713 - 715
  • [4] Semi-Supervised Latent Dirichlet Allocation and its Application for Document Classification
    Wang, Di
    Thint, Marcus
    Al-Rubaie, Ahmad
    [J]. 2012 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE AND INTELLIGENT AGENT TECHNOLOGY WORKSHOPS (WI-IAT WORKSHOPS 2012), VOL 3, 2012, : 306 - 310
  • [5] Semi-supervised Document Clustering Based on Latent Dirichlet Allocation (LDA)
    秦永彬
    李解
    黄瑞章
    李晶
    [J]. Journal of Donghua University(English Edition), 2016, 33 (05) : 685 - 688
  • [6] INFERENCE IN SUPERVISED LATENT DIRICHLET ALLOCATION
    Lakshminarayanan, Balaji
    Raich, Raviv
    [J]. 2011 IEEE INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP), 2011,
  • [7] Labeled Phrase Latent Dirichlet Allocation
    Tang, Yi-Kun
    Mao, Xian-Ling
    Huang, Heyan
    [J]. WEB INFORMATION SYSTEMS ENGINEERING - WISE 2016, PT I, 2016, 10041 : 525 - 536
  • [8] Type-2 Fuzzy Labeled Latent Dirichlet Allocation for Human Action Categorization
    Cao, Xiao-Qin
    Liu, Zhi-Qiang
    [J]. 2012 21ST INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR 2012), 2012, : 1338 - 1341
  • [9] Supervised latent semantic indexing for document categorization
    Sun, JT
    Chen, Z
    Zeng, HJ
    Lu, YC
    Shi, CY
    Ma, WY
    [J]. FOURTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2004, : 535 - 538
  • [10] Using Latent Dirichlet Allocation for Automatic Categorization of Software
    Tian, Kai
    Revelle, Meghan
    Poshyvanyk, Denys
    [J]. 2009 6TH IEEE INTERNATIONAL WORKING CONFERENCE ON MINING SOFTWARE REPOSITORIES, 2009, : 163 - 166