Robust discriminant analysis of latent semantic feature for text categorization

Authors
Hu, Jiani [1]
Deng, Weihong [1]
Guo, Jun [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing 100876, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a Discriminative Semantic Feature (DSF) method for text categorization based on the vector space model. The DSF method has two stages: it first reduces the dimension of the document vector space by Latent Semantic Indexing (LSI), and then applies a Robust linear Discriminant analysis Model (RDM), which improves on classical LDA through an energy-adaptive regularization criterion, to extract discriminative semantic features with enhanced discrimination power. As a result, the DSF method not only uncovers latent semantic structure but also captures discriminative features. Comparative experiments are also performed on several state-of-the-art dimension reduction schemes: the proposed DSF, LSI, orthogonal centroid, two-stage LSI+LDA, LDA/QR, and LDA/GSVD. Experiments on the Reuters-21578 text collection show that the proposed method performs better than the other algorithms.
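The two-stage idea in the abstract (LSI followed by a regularized discriminant step) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' RDM: the abstract does not specify the energy-adaptive regularization criterion, so a fixed ridge term controlled by a hypothetical parameter `alpha` stands in for it. The function names `lsi_project` and `regularized_lda` and the toy document-term matrix are likewise assumptions made for the example.

```python
import numpy as np


def lsi_project(X, k):
    """Stage 1 (LSI): project documents (rows of X) onto the top-k right singular vectors."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V_k = Vt[:k].T  # shape: (n_terms, k)
    return X @ V_k


def regularized_lda(Z, y, alpha=0.1):
    """Stage 2: LDA with a ridge-regularized within-class scatter matrix.

    Assumption: the fixed ridge `alpha` is a stand-in for the paper's
    energy-adaptive criterion, which the abstract does not describe.
    """
    classes = np.unique(y)
    d = Z.shape[1]
    mean_all = Z.mean(axis=0)
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Zc = Z[y == c]
        mc = Zc.mean(axis=0)
        Sw += (Zc - mc).T @ (Zc - mc)
        diff = (mc - mean_all)[:, None]
        Sb += len(Zc) * (diff @ diff.T)
    # Add a scaled identity so the within-class scatter is well conditioned.
    Sw_reg = Sw + alpha * (np.trace(Sw) / d) * np.eye(d)
    # Whiten with respect to Sw_reg, then diagonalize the whitened Sb.
    w_vals, w_vecs = np.linalg.eigh(Sw_reg)
    whiten = w_vecs / np.sqrt(w_vals)  # scales column j by 1/sqrt(w_vals[j])
    b_vals, b_vecs = np.linalg.eigh(whiten.T @ Sb @ whiten)
    top = np.argsort(b_vals)[::-1][: len(classes) - 1]
    return whiten @ b_vecs[:, top]  # shape: (d, n_classes - 1)


# Hypothetical toy document-term counts: 6 documents, 5 terms, 2 classes.
X = np.array([
    [3, 2, 0, 0, 1],
    [2, 3, 1, 0, 0],
    [4, 1, 0, 1, 0],
    [0, 0, 3, 2, 1],
    [0, 1, 2, 3, 0],
    [1, 0, 4, 2, 0],
], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

Z = lsi_project(X, k=2)         # latent semantic features
W = regularized_lda(Z, y)       # discriminant directions in LSI space
features = Z @ W                # one discriminative feature per document
print(features.ravel())
```

Running LSI first keeps the scatter matrices small, and the ridge term keeps the within-class scatter invertible even when there are few documents per class, which is the general motivation for regularizing LDA in this setting.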
Pages: 400-409
Number of pages: 10