Parallel clustering of high dimensional data by integrating multi-objective genetic algorithm with divide and conquer

Authors
Tansel Özyer
Reda Alhajj
Affiliations
[1] TOBB ETU Economics and Technology University,Department of Computer Engineering
[2] University of Calgary,Department of Computer Science
Source
Applied Intelligence | 2009 / Vol. 31, No. 3
Keywords
Clustering; Data mining; Multi-objective optimization; Validity analysis; Divide and conquer; Parallelism; Incremental clustering;
DOI
Not available
Abstract
This paper applies a divide-and-conquer approach iteratively to the clustering process. The goal is a parallelized approach that is both effective and efficient and that produces the intended clustering result. We achieve scalability by first partitioning a large dataset into subsets of manageable size, chosen according to the specifications of the machine used for clustering, and then clustering the partitions separately in parallel. The centroid of each resulting cluster is treated as the root of a tree whose leaves are the instances in that cluster. The partitioning and clustering process is applied iteratively to the centroids, with the trees growing upward, until the final clustering is obtained; the outcome is a forest with one tree per cluster. Finally, a conquer step recovers the actual intended clustering, in which each instance (leaf node) belongs to the final cluster represented by the root of its tree. We use a multi-objective genetic algorithm combined with validity indices to decide on the number of classes. The approach is well suited to interactive online clustering. It also facilitates incremental clustering, because chunks of instances are clustered as stand-alone sets and the results are then merged with existing clusters. This is attractive and feasible because only centroids are clustered after the first clustering stage. The reported test results demonstrate the applicability and effectiveness of the proposed approach.
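The divide, iterate, and conquer steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: a plain k-means routine stands in for the multi-objective genetic algorithm, the number of clusters is passed in rather than chosen by validity indices, and the names `kmeans`, `divide_and_conquer_cluster`, `chunk_size`, and `k_local` are hypothetical. Threads are used only to show where the parallelism would sit.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def kmeans(items, k, iters=10, seed=0):
    """Stand-in clusterer (NOT the paper's multi-objective GA).

    items: list of (point, leaves) pairs, where leaves are the indices of
    the original instances under that point. Returns one (centroid, leaves)
    pair per non-empty cluster, with the leaves of its members merged.
    """
    rng = random.Random(seed)
    centers = [p for p, _ in rng.sample(items, min(k, len(items)))]
    groups = []
    for _ in range(iters):
        groups = [[] for _ in centers]
        for item in items:
            dists = [sum((a - b) ** 2 for a, b in zip(item[0], c)) for c in centers]
            groups[dists.index(min(dists))].append(item)
        groups = [g for g in groups if g]  # drop empty clusters
        # Unweighted mean of the member points (a simplifying assumption).
        centers = [tuple(sum(col) / len(g) for col in zip(*(p for p, _ in g)))
                   for g in groups]
    return [(c, [leaf for _, leaves in g for leaf in leaves])
            for c, g in zip(centers, groups)]


def divide_and_conquer_cluster(points, k_final, chunk_size=50, k_local=5, workers=4):
    """Return (labels, centroids) for points, clustering chunk by chunk."""
    if not 0 < k_local < chunk_size:
        raise ValueError("k_local must be positive and smaller than chunk_size")
    # Level 0: every instance is a leaf that refers to itself.
    items = [(tuple(p), [i]) for i, p in enumerate(points)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Divide: cluster manageable chunks in parallel, then repeat on the
        # centroids until they fit into a single chunk.
        while len(items) > chunk_size:
            chunks = [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]
            results = pool.map(lambda chunk: kmeans(chunk, k_local), chunks)
            items = [node for res in results for node in res]
    # Final stage: cluster the remaining centroids; each result is a tree root.
    roots = kmeans(items, k_final)
    # Conquer: every leaf takes the label of the root of its tree.
    labels = [None] * len(points)
    for label, (_, leaves) in enumerate(roots):
        for leaf in leaves:
            labels[leaf] = label
    return labels, [c for c, _ in roots]


if __name__ == "__main__":
    rng = random.Random(1)
    data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100)]
    data += [(rng.gauss(100, 1), rng.gauss(100, 1)) for _ in range(100)]
    labels, centroids = divide_and_conquer_cluster(data, k_final=2)
    print(len(set(labels)), "clusters found")
```

Because only centroids are carried from one level to the next, each level works on far fewer items than the previous one, which is what keeps the later stages cheap. In CPython, threads do not run this pure-Python arithmetic in parallel; a process pool or separate machines would be needed for real speedup.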
Pages: 318-331
Number of pages: 13