A Novel Text Clustering Method Based on TGSOM and Fuzzy K-Means

被引:0
|
作者
Hu, Jinzhu [1 ]
Xiong, Chunxiu [1 ]
Shu, Jiangbo [1 ]
Zhou, Xing [1 ]
Zhu, Jun [1 ]
机构
[1] Hua Zhong Normal Univ, Dept Comp Sci, Wuhan 43079, Peoples R China
关键词
tree-structured growing self-organizing maps; Fuzzy K-Means; text clustering; text clustering flow model;
D O I
10.1109/ETCS.2009.14
中图分类号
TP39 [计算机的应用];
学科分类号
081203 ; 0835 ;
摘要
According to the high-dimensional sparse features of the storage of the textual document, and defects existing in the clustering methods which have already studied by now and some other problems, an effective text clustering approach (short for TGSOM-FS-FKM) based on tree-structured growing self-organizing maps (TGSOM) and Fuzzy K-Means (FKM) is proposed. It firstly makes preprocess of texts, and filter the majority of noisy words by using unsupervised feature selection method. Then it used TGSOM to execute the first clustering to get the rough classification of texts, and to get the initial clustering number and each text's category. And then introduced LSA theory to improve the precision of clustering and reduce the dimension of feature vector. After that it used TGSOM to execute the second clustering to get the more precise clustering result, and used supervised feature selection method to select feature items. Finally, it used FKM to cluster the result set. In the experiment, it remained the same number of feature items. Experimental results indicate that TGSOM-FS-FKM clustering excels to other clustering method such as DSOM-FS-FCM, and the precision is better than DSOM-FCM, DFKCN and FDMFC clustering.
引用
收藏
页码:26 / 30
页数:5
相关论文
共 50 条
  • [1] A novel image text extraction method based on k-means clustering
    Song, Yan
    Liu, Anan
    Pang, Lin
    Lin, Shouxun
    Zhang, Yongdong
    Tang, Sheng
    [J]. 7TH IEEE/ACIS INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION SCIENCE IN CONJUNCTION WITH 2ND IEEE/ACIS INTERNATIONAL WORKSHOP ON E-ACTIVITY, PROCEEDINGS, 2008, : 185 - 190
  • [2] A Fuzzy Clustering Algorithm Based on K-means
    Yan, Zhen
    Pi, Dechang
    [J]. ECBI: 2009 INTERNATIONAL CONFERENCE ON ELECTRONIC COMMERCE AND BUSINESS INTELLIGENCE, PROCEEDINGS, 2009, : 523 - 528
  • [3] A novel method for K-means clustering algorithm
    [J]. Zhao, Jinguo, 1600, Transport and Telecommunication Institute, Lomonosova street 1, Riga, LV-1019, Latvia (18):
  • [4] Chinese text clustering algorithm based k-means
    Yao, Mingyu
    Pi, Dechang
    Cong, Xiangxiang
    [J]. 2012 INTERNATIONAL CONFERENCE ON MEDICAL PHYSICS AND BIOMEDICAL ENGINEERING (ICMPBE2012), 2012, 33 : 301 - 307
  • [5] Text Document Clustering Based on Density K-means
    Wu, Di
    Zeng, Yan
    Qu, Yin-chuan
    [J]. INTERNATIONAL CONFERENCE ON COMPUTER, MECHATRONICS AND ELECTRONIC ENGINEERING (CMEE 2016), 2016,
  • [6] Weighted k-Means Algorithm Based Text Clustering
    Chen, Xiuguo
    Yin, Wensheng
    Tu, Pinghui
    Zhang, Hengxi
    [J]. IEEC 2009: FIRST INTERNATIONAL SYMPOSIUM ON INFORMATION ENGINEERING AND ELECTRONIC COMMERCE, PROCEEDINGS, 2009, : 51 - +
  • [7] Chinese Text Clustering Algorithm Based K-Means
    Yao, Mingyu
    Pi, Dechang
    Cong, Xiangxiang
    [J]. 2011 AASRI CONFERENCE ON ARTIFICIAL INTELLIGENCE AND INDUSTRY APPLICATION (AASRI-AIIA 2011), VOL 1, 2011, : 90 - 93
  • [8] A Novel MapReduce Based k-Means Clustering
    Sinha, Ankita
    Jana, Prasanta K.
    [J]. PROCEEDINGS OF THE FIRST INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND COMMUNICATION, 2017, 458 : 247 - 255
  • [9] A Clustering Method Based on K-Means Algorithm
    Li, Youguo
    Wu, Haiyan
    [J]. INTERNATIONAL CONFERENCE ON SOLID STATE DEVICES AND MATERIALS SCIENCE, 2012, 25 : 1104 - 1109
  • [10] Clustering of Image Data Using K-Means and Fuzzy K-Means
    Rahmani, Md. Khalid Imam
    Pal, Naina
    Arora, Kamiya
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2014, 5 (07) : 160 - 163