Document clustering using locality preserving indexing

Cited by: 542
Authors
Cai, D
He, XF
Han, JW
Affiliations
[1] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
[2] Univ Chicago, Dept Comp Sci, Chicago, IL 60637 USA
Keywords
document clustering; locality preserving indexing; dimensionality reduction; semantics
DOI
10.1109/TKDE.2005.198
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose a novel document clustering method which aims to cluster the documents into different semantic classes. The document space is generally of high dimensionality and clustering in such a high dimensional space is often infeasible due to the curse of dimensionality. By using Locality Preserving Indexing (LPI), the documents can be projected into a lower-dimensional semantic space in which the documents related to the same semantics are close to each other. Different from previous document clustering methods based on Latent Semantic Indexing (LSI) or Nonnegative Matrix Factorization (NMF), our method tries to discover both the geometric and discriminating structures of the document space. Theoretical analysis of our method shows that LPI is an unsupervised approximation of the supervised Linear Discriminant Analysis (LDA) method, which gives the intuitive motivation of our method. Extensive experimental evaluations are performed on the Reuters-21578 and TDT2 data sets.
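The projection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the commonly used locality-preserving-projection formulation (a cosine-weighted k-nearest-neighbor graph followed by a generalized eigenproblem on the graph Laplacian), none of which is spelled out in this record. The function name `lpi_embed`, its parameters, and the small ridge term are hypothetical choices made for illustration.

```python
import numpy as np

def lpi_embed(X, n_components=2, n_neighbors=5):
    """Illustrative locality-preserving embedding of documents (rows of X).

    Returns (embedding, eigenvalues). Hypothetical helper, not the paper's code.
    """
    X = np.asarray(X, dtype=float)
    # Unit-normalize each document so dot products are cosine similarities.
    X = X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-12)
    n = X.shape[0]

    # SVD step: work in an orthonormal basis U of the column space of X so
    # that the matrix B below is well conditioned.
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    Z = U[:, s > 1e-8 * s[0]]

    # Symmetric k-nearest-neighbor graph weighted by cosine similarity.
    S = X @ X.T
    np.fill_diagonal(S, -np.inf)              # exclude self-matches
    idx = np.argsort(-S, axis=1)[:, :n_neighbors]
    rows = np.repeat(np.arange(n), n_neighbors)
    W = np.zeros((n, n))
    W[rows, idx.ravel()] = np.maximum(S[rows, idx.ravel()], 0.0)
    W = np.maximum(W, W.T)

    D = np.diag(W.sum(axis=1))                # degree matrix
    L = D - W                                 # graph Laplacian
    A = Z.T @ L @ Z
    B = Z.T @ D @ Z
    # Assumption: a tiny ridge keeps B positive definite for the Cholesky step.
    B += 1e-9 * np.trace(B) / B.shape[0] * np.eye(B.shape[0])

    # Solve A a = lambda B a by reducing it to a standard symmetric problem.
    C = np.linalg.cholesky(B)                 # B = C C^T
    M = np.linalg.solve(C, np.linalg.solve(C, A).T)
    M = (M + M.T) / 2.0
    vals, vecs = np.linalg.eigh(M)            # eigenvalues in ascending order
    a = np.linalg.solve(C.T, vecs[:, :n_components])
    return Z @ a, vals[:n_components]
```

Under these assumptions, the rows of the returned embedding are the low-dimensional document coordinates; a standard clustering algorithm such as k-means could then be run on them.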
Pages: 1624-1637
Number of pages: 14