Robust discriminant analysis of latent semantic feature for text categorization

Cited by: 0
Authors
Hu, Jiani [1 ]
Deng, Weihong [1 ]
Guo, Jun [1 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing 100876, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a Discriminative Semantic Feature (DSF) method for text categorization based on the vector space model. The DSF method has two stages: it first reduces the dimension of the document vector space by Latent Semantic Indexing (LSI), and then applies a Robust linear Discriminant analysis Model (RDM), which improves the classical LDA with an energy-adaptive regularization criterion, to extract discriminative semantic features with enhanced discrimination power. As a result, the DSF method can not only uncover latent semantic structure but also capture discriminative features. Comparative experiments are also performed on several state-of-the-art dimension reduction schemes, including our DSF, LSI, orthogonal centroid, two-stage LSI+LDA, LDA/QR, and LDA/GSVD. Experiments using the Reuters-21578 text collection show that the proposed method outperforms the other algorithms.
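The two-stage idea in the abstract (LSI for dimension reduction, then a regularized LDA on the reduced features) can be sketched roughly as follows. This is an illustrative sketch, not the authors' RDM: the abstract does not specify the energy-adaptive regularization criterion, so a simple ridge term scaled by the average within-class energy is used as a stand-in. The function names `lsi_fit` and `regularized_lda_fit`, the parameter `alpha`, and the toy term-count matrix are all hypothetical.

```python
import numpy as np

def lsi_fit(X, k):
    """Return an (n_terms, k) projection onto the top-k right singular vectors of X.

    X is an (n_docs, n_terms) document-term matrix.
    """
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:k].T

def regularized_lda_fit(Z, y, alpha=1e-2):
    """Return a (k, n_classes - 1) discriminant projection for features Z.

    ASSUMPTION: the ridge term alpha * (trace(Sw) / k) * I is a generic
    stand-in for the paper's energy-adaptive criterion, which the
    abstract does not define.
    """
    classes = np.unique(y)
    k = Z.shape[1]
    overall_mean = Z.mean(axis=0)
    Sw = np.zeros((k, k))  # within-class scatter
    Sb = np.zeros((k, k))  # between-class scatter
    for c in classes:
        Zc = Z[y == c]
        class_mean = Zc.mean(axis=0)
        centered = Zc - class_mean
        Sw += centered.T @ centered
        diff = (class_mean - overall_mean)[:, None]
        Sb += len(Zc) * (diff @ diff.T)
    Sw_reg = Sw + alpha * (np.trace(Sw) / k) * np.eye(k)
    # Whiten with the Cholesky factor so the generalized eigenproblem
    # Sb w = lambda * Sw_reg w becomes a symmetric one.
    L_inv = np.linalg.inv(np.linalg.cholesky(Sw_reg))
    evals, evecs = np.linalg.eigh(L_inv @ Sb @ L_inv.T)
    top = np.argsort(evals)[::-1][: len(classes) - 1]
    return L_inv.T @ evecs[:, top]

# Toy document-term counts (hypothetical): 6 documents, 5 terms, 2 classes.
X = np.array([
    [3, 2, 0, 0, 1],
    [2, 3, 1, 0, 0],
    [4, 1, 0, 1, 0],
    [0, 0, 3, 2, 1],
    [0, 1, 2, 3, 0],
    [1, 0, 4, 2, 0],
], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

P = lsi_fit(X, k=3)            # stage 1: latent semantic space
Z = X @ P                      # documents in the reduced space
W = regularized_lda_fit(Z, y)  # stage 2: discriminant directions
F = Z @ W                      # final discriminative features
```

With two classes the second stage yields a single discriminant direction, so `F` has one column. A new document vector `x` would be mapped with the same projections, as `x @ P @ W`.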
Pages: 400-409
Page count: 10
Related papers
50 items in total
  • [31] A Novel Feature Selection Method Based on Probability Latent Semantic Analysis for Chinese Text Classification
    Zhong Jiang
    Sun Qigan
    Li Xue
    Wen Luosheng
    CHINESE JOURNAL OF ELECTRONICS, 2011, 20 (02): : 228 - 232
  • [32] TOFA: Trace Oriented Feature Analysis in Text Categorization
    Yan, Jun
    Liu, Ning
    Yang, Qiang
    Fan, Weiguo
    Chen, Zheng
    ICDM 2008: EIGHTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2008, : 668 - +
  • [33] Latent Factor SVM for Text Categorization
    Zhou, Xiaofei
    Guo, Li
    Liu, Ping
    Liu, Yanbing
    2014 IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOP (ICDMW), 2014, : 105 - 110
  • [34] A Comparative Analysis of Strategies for Semantic Short-Text Categorization
    Rosas, Maria V.
    Errecalde, Marcelo L.
    Rosso, Paolo
    PROCESAMIENTO DEL LENGUAJE NATURAL, 2010, (44): : 11 - 18
  • [35] Medical Record Text Analysis Based on Latent Semantic Analysis
    Jin, Xinyu
    Ma, Wentao
    Li, Yunze
    2015 8TH INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DESIGN (ISCID), VOL 2, 2015, : 108 - 110
  • [36] Text structure analysis based on latent semantic indexing
    Lin, Hongfei
    Zhan, Xuegang
    Yao, Tianshun
    Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, 2000, 13 (01): : 47 - 51
  • [37] Automatic Text Summarization Using Latent Semantic Analysis
    Mashechkin, I. V.
    Petrovskiy, M. I.
    Popov, D. S.
    Tsarev, D. V.
    PROGRAMMING AND COMPUTER SOFTWARE, 2011, 37 (06) : 299 - 305
  • [38] Random indexing of text samples for latent semantic analysis
    Kanerva, P
    Kristoferson, J
    Holst, H
    PROCEEDINGS OF THE TWENTY-SECOND ANNUAL CONFERENCE OF THE COGNITIVE SCIENCE SOCIETY, 2000, : 1036 - 1036
  • [39] Robust discriminant analysis of Gabor feature for face recognition
    Deng, Weihong
    Hu, Jiani
    Guo, Jun
    Zhang, Honggang
    FOURTH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, VOL 3, PROCEEDINGS, 2007, : 248 - +
  • [40] Automatic text summarization using latent semantic analysis
    I. V. Mashechkin
    M. I. Petrovskiy
    D. S. Popov
    D. V. Tsarev
    Programming and Computer Software, 2011, 37 : 299 - 305