An Improved LDA Algorithm for Text Classification

Cited by: 0
Authors
Zhao, Dexin [1]
He, Jinqun [1]
Liu, Jin [2]
Affiliations
[1] Tianjin Univ Technol, Tianjin Key Lab Intelligent Comp & Novel Software, Tianjin 300384, Peoples R China
[2] Tianjin Keyilong Decorat Engn Co Ltd, Tianjin 300202, Peoples R China
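Keywords
topic model; LDA; text classification
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract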
Latent Dirichlet Allocation (LDA) is a classic topic model that can extract latent topics from a large corpus. The model assumes that if a document is relevant to a topic, then all tokens in the document are relevant to that topic. In this paper, we present an algorithm called gLDA for topic-based text classification, which adds a topic-category distribution parameter to LDA so that each document is generated from its most relevant category. Gibbs sampling is employed for approximate inference, and experimental results on two datasets show the effectiveness of the method.
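As an illustration of the inference step mentioned in the abstract, the following is a minimal sketch of collapsed Gibbs sampling for plain LDA. It is not the paper's gLDA: this record gives no details of the topic-category parameter, so that part is omitted. The function name lda_gibbs, its parameters, and the default hyperparameters alpha and beta are assumptions made for this example only.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, iters=50, seed=0):
    """Collapsed Gibbs sampling for standard LDA (hypothetical helper, not gLDA).

    docs: list of documents, each a list of integer word ids in [0, vocab_size).
    Returns (doc_topic_counts, topic_word_counts).
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                # n_{d,k}
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # n_{k,w}
    topic_total = [0] * num_topics                              # n_k
    assignments = []

    # Assign a random initial topic to every token and fill the count tables.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # p(z = t | rest) is proportional to
                # (n_{d,t} + alpha) * (n_{t,w} + beta) / (n_t + V * beta)
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                # Resample the topic and restore the counts.
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return doc_topic, topic_word
```

Calling lda_gibbs([[0, 1, 2, 0], [3, 4, 3]], num_topics=2, vocab_size=5) returns the per-document topic counts and per-topic word counts. Normalizing these counts with the alpha and beta smoothing terms gives estimates of the document-topic and topic-word distributions.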
Pages: 216 / +
Page count: 2