Neural labeled LDA: a topic model for semi-supervised document classification

Authors
Wei Wang
Bing Guo
Yan Shen
Han Yang
Yaosen Chen
Xinhua Suo
Affiliations
[1] Sichuan University, College of Computer Science
[2] Sobey Technology, Media Intelligence Laboratory
[3] Peng Cheng Laboratory, School of Computer Science
[4] Chengdu University of Information Technology
Source
Soft Computing | 2021 / Vol. 25
Keywords
Neural topic model; Semi-supervised learning; Document classification
DOI
Not available
Abstract
Recently, several statistical topic modeling approaches based on LDA have been applied to supervised document classification, where the generative process of the model incorporates prior knowledge to improve classification performance. However, these customized topic models are limited by the cumbersome derivation of a specific inference algorithm for each modification. In this paper, we propose a new supervised topic modeling approach for document classification, Neural Labeled LDA (NL-LDA), which builds on the VAE framework and uses a specially designed generative network to incorporate prior information. The proposed model supports semi-supervised learning based on the manifold assumption and the low-density assumption. In addition, NL-LDA uses the same concise inference method for both semi-supervised learning and prediction. Quantitative experimental results demonstrate that our model outperforms the compared approaches, including traditional statistical and neural topic models, on supervised document classification. In particular, the proposed model supports both single-label and multi-label document classification. NL-LDA also performs well on semi-supervised classification, especially when only a small amount of labeled data is available. Further comparisons with related work indicate that our model is competitive with state-of-the-art topic modeling approaches on semi-supervised classification.
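The abstract does not specify the NL-LDA architecture, so the following is only a generic illustration of the two ideas it names: a VAE-style objective for a topic model, and a label prior that restricts which topics a document may use (in the spirit of Labeled LDA). It is a minimal NumPy sketch under stated assumptions, not the authors' model. All function names, the one-topic-per-label mask, and the standard normal prior are hypothetical choices made for this example.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def masked_topic_proportions(logits, label_mask):
    # Assumption: one topic per label. label_mask[k] = 1 if topic k is allowed.
    # An all-ones mask stands in for an unlabeled document.
    # At least one topic must be allowed, otherwise the result is undefined.
    masked = np.where(label_mask > 0, logits, -np.inf)
    return softmax(masked)

def elbo(bow, mu, log_var, beta, label_mask, rng):
    # bow: (V,) word counts; mu, log_var: (K,) encoder outputs (assumed given);
    # beta: (K, V) unnormalized topic-word weights.
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * log_var) * eps              # reparameterization trick
    theta = masked_topic_proportions(z, label_mask)   # (K,) topic proportions
    word_probs = theta @ softmax(beta)                # (V,) mixture over topics
    recon = np.sum(bow * np.log(word_probs + 1e-10))  # log-likelihood of counts
    # KL divergence between N(mu, diag(exp(log_var))) and N(0, I).
    kl = -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var))
    return recon - kl

rng = np.random.default_rng(0)
K, V = 3, 5
beta = rng.standard_normal((K, V))
bow = np.array([1.0, 0.0, 2.0, 0.0, 1.0])
labeled = elbo(bow, np.zeros(K), np.zeros(K), beta, np.array([1, 0, 1]), rng)
unlabeled = elbo(bow, np.zeros(K), np.zeros(K), beta, np.ones(K), rng)
```

In this sketch a labeled and an unlabeled document go through the same objective and differ only in the mask, which is one simple way a single inference routine could serve both cases.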
Pages: 14561-14571
Number of pages: 10