Document clustering with hierarchical algorithm

Cited by: 0
Authors
Wang, Y [1]
Hodges, J [1]
Affiliation
[1] Mississippi State Univ, Dept Comp Sci & Engn, Mississippi State, MS 39762 USA
Keywords
document clustering; information retrieval
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Document clustering is a widely used strategy for information retrieval and text data mining. Partitioning and hierarchical clustering methods are the most widely used algorithms. Other investigators have proposed the bisecting K-means method for document clustering, and their experimental results indicate that it is the preferred method for this task [16]. However, our research found that, although the bisecting K-means method has advantages when working with large datasets, a traditional hierarchical clustering algorithm still achieves the best performance on small datasets.
Pages: 1614-1617
Number of pages: 4