Data Clustering Algorithms: Experimentation and Comparison

Authors
Khandare, Anand [1]
Pawar, Rutika [1]
Affiliations
[1] Thakur Coll Engn & Technol, Dept Comp Engn, Mumbai, Maharashtra, India
Keywords
Clustering; Data mining; KDD; K-Means clustering; DBSCAN; Agglomerative; Clusters;
DOI
10.1007/978-981-16-4863-2_8
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Due to the growth of databases of all kinds, clustering has become one of the most essential tasks for organizing data. Clustering means grouping or dividing data points based on their similarity to each other. It can be described as an unsupervised data mining technique that describes the nature of datasets. The main objective of data clustering is to obtain groups of similar entities. There are various methods of clustering such as hierarchical, partition-based, method-based, grid-based, and model-based. This paper provides a detailed study of clustering and its working processes. Along with basic information, the validity measures required to evaluate algorithms are discussed in detail. This paper reviews clustering algorithms such as K-Means, Agglomerative, and DBSCAN. A tabular comparison of the algorithms is presented to provide in-depth knowledge. The results obtained from experimenting with the algorithms are also discussed at the end.
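As a rough illustration of what "grouping data points based on their similarity" means for K-Means, one of the algorithms the abstract names, the following is a minimal sketch of Lloyd's algorithm in plain Python. It is not the authors' implementation: the function name `kmeans`, the sample `points`, and the evenly spaced choice of initial centroids are all assumptions made here for illustration.

```python
import math


def kmeans(points, k, max_iterations=100):
    """Minimal K-Means (Lloyd's algorithm) sketch on tuples of floats.

    Assumption: initial centroids are picked at evenly spaced positions
    in the input list so that the result is deterministic. Practical
    implementations usually use random or k-means++ initialization.
    """
    n = len(points)
    centroids = [tuple(points[(i * n) // k]) for i in range(k)]
    labels = [0] * n

    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid
        # (Euclidean distance is the similarity measure here).
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]

        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[c])

        # Stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids

    return labels, centroids


# Hypothetical toy data: two visually separate groups in 2D.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels, centroids = kmeans(points, k=2)
print(labels, centroids)
```

K-Means needs the number of clusters `k` up front, whereas DBSCAN and agglomerative clustering, the other two algorithms reviewed, derive groups from a density threshold or a merge hierarchy instead.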
Pages: 86-99
Page count: 14
Related papers
  • [1] Comparison of Data Mining Clustering Algorithms
    Shah, Chintan
    Jivani, Anjali
    2013 4TH NIRMA UNIVERSITY INTERNATIONAL CONFERENCE ON ENGINEERING (NUICONE 2013), 2013
  • [2] COMPARISON OF ALGORITHMS FOR CLUSTERING INCOMPLETE DATA
    Matyja, Artur
    Siminski, Krzysztof
    FOUNDATIONS OF COMPUTING AND DECISION SCIENCES, 2014, 39 (02) : 107 - 127
  • [3] A Comparison of Clustering Algorithms for Data Streams
    Pereira, Cassio M. M.
    de Mello, Rodrigo F.
    INTEGRATED COMPUTING TECHNOLOGY, 2011, 165 : 59 - 74
  • [4] COMPARISON OF CLUSTERING ALGORITHMS: AN EXAMPLE WITH PROTEOMIC DATA
    Dasgupta, Nairanjana
    Chen, Yibing
    Kalyanaraman, Ananth
    Daoud, Sayed
    ADVANCES AND APPLICATIONS IN STATISTICS, 2013, 33 (01) : 63 - 81
  • [5] Unsupervised connectionist algorithms for clustering an environmental data set: A comparison
    Bougrain, L
    Alexandre, F
    NEUROCOMPUTING, 1999, 28 : 177 - 189
  • [6] NUMERICAL COMPARISON OF THE RFCM AND AP ALGORITHMS FOR CLUSTERING RELATIONAL DATA
    BEZDEK, JC
    HATHAWAY, RJ
    WINDHAM, MP
    PATTERN RECOGNITION, 1991, 24 (08) : 783 - 791
  • [7] Comparison of Clustering Algorithms in Text Clustering Tasks
    Gallardo Garcia, Rafael
    Beltran, Beatriz
    Vilarino, Darnes
    Zepeda, Claudia
    Martinez, Rodolfo
    COMPUTACION Y SISTEMAS, 2020, 24 (02) : 429 - 437
  • [8] Data Science at Udemy Agile Experimentation with Algorithms
    Wai, Larry
    PROCEEDINGS OF 2016 FUTURE TECHNOLOGIES CONFERENCE (FTC), 2016 : 355 - 360
  • [9] CLUSTERING ALGORITHMS FOR LIBRARY COMPARISON
    SRIDHAR, V
    MURTY, MN
    PATTERN RECOGNITION, 1991, 24 (09) : 815 - 823
  • [10] Comparison of algorithms for web document clustering using graph representations of data
    Schenker, A
    Last, M
    Bunke, H
    Kandell, A
    STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, PROCEEDINGS, 2004, 3138 : 190 - 197