Supervised latent semantic indexing for document categorization

Cited by: 20
Authors
Sun, JT [1]
Chen, Z [1]
Zeng, HJ [1]
Lu, YC [1]
Shi, CY [1]
Ma, WY [1]
Affiliation
[1] Tsinghua Univ, Dept Comp Sci, Beijing 100084, Peoples R China
DOI
10.1109/ICDM.2004.10004
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Latent Semantic Indexing (LSI) is a successful technique in information retrieval (IR) that attempts to uncover the latent semantics implied by a query or a document by representing them in a dimension-reduced space. However, LSI is not optimal for document categorization because it aims to find the most representative features for document representation rather than the most discriminative ones. In this paper we propose Supervised LSI (SLSI), which iteratively selects the most discriminative basis vectors using the training data. The extracted vectors are then used to project the documents into a reduced-dimensional space for better classification. Experimental evaluations show that the SLSI approach leads to dramatic dimension reduction while achieving good classification results.
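The contrast the abstract draws between representative and discriminative basis vectors can be illustrated with a small sketch. This is not the paper's SLSI procedure, which the abstract describes only as iterative: the code below computes a plain LSI basis with an SVD and then, as a stand-in assumption, ranks the basis vectors once by a Fisher-style ratio of between-class to within-class scatter. The function names `lsi_basis` and `select_discriminative`, the toy matrix, and the scoring rule are all hypothetical choices made for illustration.

```python
import numpy as np

def lsi_basis(X, k):
    """Top-k left singular vectors of a term-by-document matrix X (plain LSI)."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k]

def select_discriminative(U, X, y, m):
    """Keep the m basis vectors with the highest Fisher-style score.

    Assumption for illustration only: the score is between-class scatter
    divided by within-class scatter of the projected training documents.
    """
    Z = U.T @ X                                  # (k, n_docs) projections
    overall = Z.mean(axis=1, keepdims=True)
    between = np.zeros(Z.shape[0])
    within = np.zeros(Z.shape[0])
    for c in np.unique(y):
        Zc = Z[:, y == c]
        mu = Zc.mean(axis=1, keepdims=True)
        between += Zc.shape[1] * ((mu - overall) ** 2).ravel()
        within += ((Zc - mu) ** 2).sum(axis=1)
    score = between / (within + 1e-12)           # small constant avoids 0/0
    keep = np.argsort(score)[::-1][:m]
    return U[:, keep], keep

# Toy term-by-document matrix: 3 terms (rows) x 8 documents (columns).
# Term 0 has large variance but the same distribution in both classes;
# terms 1 and 2 are low-variance but each marks one class.
X = np.array([
    [5, 1, 5, 1, 5, 1, 5, 1],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 1],
], dtype=float)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

U = lsi_basis(X, k=3)
basis, kept = select_discriminative(U, X, y, m=1)
Z = basis.T @ X                                  # documents in the 1-D reduced space
```

On this toy matrix the leading singular vector is dominated by the high-variance term 0, which carries no class information, while the selected vector (index 1 in singular-value order) separates the two classes by sign. That is the sense in which the most representative direction need not be the most discriminative one.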
Pages: 535-538
Page count: 4