Text clustering based on kernel KNN clustering algorithm

Cited by: 0
Authors
Xiong, Hao [1]
Sun, Sheng [1]
Feng, Yunfang [1]
Affiliation
[1] Computer School, Hubei Polytechnic University, Huangshi 435003, Hubei, China
Keywords
Attribute selection; Collection of documents; Document clustering; Higher-dimensional; K-nearest neighbors; Kernel methods; Nonlinear functions; Text clustering
DOI
Not available
Abstract
Document clustering is a popular tool for automatically organizing large collections of documents. In this paper, we propose a kernel-based K-nearest neighbor clustering algorithm (KKNNC) built on the KNN method. The algorithm maps samples into a higher-dimensional feature space using a nonlinear function before clustering, then separates them linearly in the kernel space. We also propose a new attribute selection method, the ATS algorithm, which selects important terms in documents. Our algorithm first uses ATS to eliminate redundant attributes from the data sets, then assigns each selected attribute a weight according to the relationships among the attributes. Experimental results show that the algorithm is effective on the text clustering task. © 2013 by CESER Publications.
Pages: 69-75
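The abstract describes mapping samples into a higher-dimensional feature space and finding nearest neighbors there. A minimal sketch of that general idea follows; it is not the authors' KKNNC implementation. The abstract does not name a kernel, a neighbor-grouping rule, or the details of ATS, so the RBF kernel, the `gamma` value, the connected-components grouping, and all function names here are assumptions chosen for illustration. The attribute selection and weighting steps are omitted. Distances in the feature space are obtained with the kernel trick, d²(x, y) = K(x, x) − 2K(x, y) + K(y, y), without computing the mapping explicitly.

```python
import numpy as np

def rbf_kernel(X, gamma=0.5):
    """Gram matrix of an RBF kernel (assumed kernel; the abstract names none)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kernel_knn(K, k):
    """Indices of the k nearest neighbors of each sample in the feature space."""
    diag = np.diag(K)
    # Kernel trick: squared feature-space distance from Gram matrix entries.
    d2 = np.maximum(diag[:, None] + diag[None, :] - 2.0 * K, 0.0)
    np.fill_diagonal(d2, np.inf)  # a sample is not its own neighbor
    return np.argsort(d2, axis=1)[:, :k]

def knn_graph_clusters(neighbors):
    """Group samples by connected components of the k-NN graph.

    This grouping rule is an illustrative stand-in, not the paper's procedure.
    """
    n = len(neighbors)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in neighbors[i]:
            ri, rj = find(i), find(int(j))
            if ri != rj:
                parent[rj] = ri

    relabel = {}
    return [relabel.setdefault(find(i), len(relabel)) for i in range(n)]

# Toy data: two well-separated groups of three points each.
X = np.array([[0, 0], [0, 1], [1, 0],
              [10, 10], [10, 11], [11, 10]], dtype=float)
neighbors = kernel_knn(rbf_kernel(X, gamma=0.1), k=2)
labels = knn_graph_clusters(neighbors)
print(labels)
```

For text data, the rows of `X` would typically be term-weight vectors restricted to the selected attributes; how those weights are derived is specific to the paper and is not reproduced here.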