Comparison of Clustering Algorithms in Text Clustering Tasks

Cited by: 2
Authors
Gallardo Garcia, Rafael [1]
Beltran, Beatriz [1,2]
Vilarino, Darnes [1,2]
Zepeda, Claudia [1]
Martinez, Rodolfo [1]
Affiliations
[1] Benemerita Univ Autonoma Puebla, Fac Comp Sci, Puebla, Mexico
[2] Benemerita Univ Autonoma Puebla, Language & Knowledge Engn Lab, Puebla, Mexico
Source
COMPUTACION Y SISTEMAS | 2020 / Vol. 24 / No. 02
Keywords
Affinity propagation; f-measure; k-means; spectral clustering; PAN;
DOI
10.13053/CyS-24-2-3369
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The purpose of this paper is to compare the performance and accuracy of several clustering algorithms in text clustering tasks. Text preprocessing was performed using Term Frequency - Inverse Document Frequency (TF-IDF) weighting to obtain weights for each word in each text and then weights for each text. Cosine similarity was used as the similarity measure between texts. The clustering tasks were carried out on the PAN dataset using three algorithms: Affinity Propagation, K-Means, and Spectral Clustering. The results are presented in comparative tables that list the task ID, the ground-truth clusters, and the clusters generated by each algorithm. A further table reports precision, recall, and F-measure scores.
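The preprocessing and similarity steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the whitespace tokenizer, the specific TF-IDF variant (relative term frequency times the natural log of inverse document frequency), the function names, and the sample texts are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(texts):
    """Return one {term: weight} dict per text (assumed variant: tf * ln(N / df))."""
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    tokenized = [text.lower().split() for text in texts]
    n_docs = len(tokenized)
    # Document frequency: number of texts that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def similarity_matrix(texts):
    """Pairwise cosine similarities between the TF-IDF vectors of the texts."""
    vectors = tfidf_vectors(texts)
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Hypothetical sample texts, not taken from the PAN dataset.
texts = [
    "the cat sat on the mat",
    "the cat lay on the rug",
    "stock markets fell sharply today",
]
sim = similarity_matrix(texts)
```

A matrix such as `sim` is the kind of input that similarity-based algorithms accept directly; for example, scikit-learn's `AffinityPropagation` and `SpectralClustering` both take `affinity="precomputed"`, while `KMeans` operates on the feature vectors themselves. Whether the paper used these particular implementations is not stated in the abstract.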
Pages: 429 - 437
Page count: 9
Related papers
50 in total
  • [31] A Systematic Comparison of Genome Scale Clustering Algorithms
    Jay, Jeremy J.
    Eblen, John D.
    Zhang, Yun
    Benson, Mikael
    Perkins, Andy D.
    Saxton, Arnold M.
    Voy, Brynn H.
    Chesler, Elissa J.
    Langston, Michael A.
    BIOINFORMATICS RESEARCH AND APPLICATIONS, 2011, 6674 : 416 - +
  • [32] A comparison of clustering algorithms in article recommendation system
    Tantanasiriwong, Supaporn
    FOURTH INTERNATIONAL CONFERENCE ON MACHINE VISION (ICMV 2011): MACHINE VISION, IMAGE PROCESSING, AND PATTERN ANALYSIS, 2012, 8349
  • [33] A comparison study of clustering algorithms for microblog posts
    Lin Li
    Jingjing Ye
    Fang Deng
    Shengwu Xiong
    Luo Zhong
    Cluster Computing, 2016, 19 : 1333 - 1345
  • [34] The Comparison of Clustering Algorithms for Network Intrusion Detection
    Tong, Hongyan
    Zhu, Anmin
    Guo, Yanmei
    INTERNATIONAL CONFERENCE ON ELECTRICAL AND CONTROL ENGINEERING (ICECE 2015), 2015, : 702 - 707
  • [35] A Comparison of Unsupervised Learning Algorithms for Gesture Clustering
    Ball, Adrian
    Rye, David
    Ramos, Fabio
    Velonaki, Mari
    PROCEEDINGS OF THE 6TH ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTIONS (HRI 2011), 2011, : 111 - 112
  • [36] COMPARISON OF CLUSTERING ALGORITHMS: AN EXAMPLE WITH PROTEOMIC DATA
    Dasgupta, Nairanjana
    Chen, Yibing
    Kalyanaraman, Ananth
    Daoud, Sayed
    ADVANCES AND APPLICATIONS IN STATISTICS, 2013, 33 (01) : 63 - 81
  • [37] Comparison of clustering algorithms for analog modulation classification
    Guldemir, H
    Sengur, A
    EXPERT SYSTEMS WITH APPLICATIONS, 2006, 30 (04) : 642 - 649
  • [38] Comparison of clustering algorithms in the context of software evolution
    Wu, JW
    Hassan, AE
    Holt, RC
    ICSM 2005: PROCEEDINGS OF THE 21ST IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE, 2005, : 525 - 535
  • [39] Genetic algorithms for clustering and fuzzy clustering
    Bandyopadhyay, Sanghamitra
    WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, 2011, 1 (06) : 524 - 531
  • [40] Swarm Intelligence Algorithms in Text Document Clustering with Various Benchmarks
    Selvaraj, Suganya
    Choi, Eunmi
    SENSORS, 2021, 21 (09)