An Improved Hierarchical K-Means Algorithm for Web Document Clustering

Cited by: 3
Authors
Liu, Yongxin [1]
Liu, Zhijng [1]
Affiliations
[1] Xidian Univ, Sch Comp Sci & Technol, Xian 710071, Peoples R China
DOI
10.1109/ICCSIT.2008.152
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
To address the major challenges of current web document clustering, namely the huge volume of documents and high-dimensional processing, we propose a simple agglomerative hierarchical K-Means clustering (SAHKC) algorithm based on the H-K (hierarchical K-Means) algorithm. We also use a new model to describe web documents, the multiple feature vector space model (MFVSM). Experimental results indicate that the MFVSM helps improve the quality of the clustering result and that, compared with the H-K algorithm, the SAHKC algorithm reduces running time by nearly 30%, while the average precision of the clustering result drops by only about 10%.
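The abstract does not spell out the steps of H-K or SAHKC, so the following is only a minimal sketch of one common reading of hierarchical K-Means: over-cluster with plain K-Means, then agglomeratively merge the clusters whose centroids are most similar. The function names, the cosine similarity measure, and the parameters `k_initial` and `k_final` are assumptions made for illustration; this is not the authors' SAHKC algorithm and does not model the MFVSM.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def kmeans(docs, k, iters=20, seed=0):
    """Plain K-Means with cosine similarity; returns lists of doc indices."""
    rng = random.Random(seed)
    centroids = [docs[i] for i in rng.sample(range(len(docs)), k)]
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for i, d in enumerate(docs):
            best = max(range(k), key=lambda c: cosine(d, centroids[c]))
            clusters[best].append(i)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [mean([docs[i] for i in c]) if c else centroids[j]
                     for j, c in enumerate(clusters)]
    return [c for c in clusters if c]


def hierarchical_kmeans(docs, k_initial, k_final):
    """Over-cluster with K-Means, then merge the most similar clusters."""
    clusters = kmeans(docs, k_initial)
    while len(clusters) > k_final:
        cents = [mean([docs[i] for i in c]) for c in clusters]
        # Pick the pair of clusters whose centroids are most similar.
        a, b = max(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda p: cosine(cents[p[0]], cents[p[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Toy term-weight vectors standing in for web documents (hypothetical data).
toy_docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
result = hierarchical_kmeans(toy_docs, k_initial=3, k_final=2)
print(result)
```

In this reading, the K-Means pass keeps the expensive pairwise merging step small, because merging operates on a handful of centroids rather than on every document.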
Pages: 606-610
Number of pages: 5