Particle Swarm Optimization for Large-Scale Clustering on Apache Spark

Citations: 0
Authors
Sherar, Matthew [1 ]
Zulkernine, Farhana [1 ]
Affiliations
[1] Queens Univ, Sch Comp, Kingston, ON, Canada
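Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract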
We present a particle swarm optimization (PSO) clustering algorithm implemented in Apache Spark to achieve parallel big data clustering. Apache Spark is an in-memory big data analytics framework that uses parallel distributed processing to analyze large amounts of data faster than most other existing data analytics tools. Spark's library of data analytics functions does not include the PSO algorithm. PSO is an evolutionary computing technique that has been shown to produce more compact clusters than other partitional clustering techniques for a wide range of data. In addition, PSO is a parallelizable and customizable algorithm well suited to multi-objective clustering problems. In this paper we present our implementation of a hybrid K-Means PSO (KMPSO) clustering algorithm in Apache Spark and demonstrate the performance gained in Spark by comparing our implementation with an implementation of KMPSO in MATLAB. We demonstrate that KMPSO can produce better clustering results than Spark's built-in clustering algorithms, and that Apache Spark enables efficient scaling of resources to handle large and complex workloads.
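The abstract does not spell out how KMPSO works, so the following is only an illustrative single-machine sketch in Python, not the authors' Spark implementation. It assumes one common formulation of hybrid K-Means PSO: each particle encodes k candidate centroids, fitness is the mean squared distance from each point to its nearest centroid, and a short K-Means run seeds one particle of the swarm. The function names (`kmpso`, `kmeans`, `quantization_error`) and the parameter values (`w`, `c1`, `c2`, swarm size, iteration counts) are assumptions chosen for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantization_error(centroids, points):
    """Mean squared distance from each point to its nearest centroid."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points) / len(points)


def kmeans(points, k, iters, rng):
    """Plain Lloyd-style K-Means, used here only to seed one particle."""
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            groups[nearest].append(p)
        # Recompute each centroid as its group mean; keep the old one if empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def kmpso(points, k, n_particles=10, iters=30, w=0.72, c1=1.49, c2=1.49, seed=0):
    """Hypothetical hybrid K-Means PSO: returns (centroids, quantization error)."""
    rng = random.Random(seed)
    dim = len(points[0])
    size = k * dim  # a particle is k centroids flattened into one list

    def unflatten(pos):
        return [tuple(pos[i * dim:(i + 1) * dim]) for i in range(k)]

    def fitness(pos):
        return quantization_error(unflatten(pos), points)

    # Particle 0 starts from a short K-Means run; the rest from random points.
    swarm = [[x for c in kmeans(points, k, 5, rng) for x in c]]
    for _ in range(n_particles - 1):
        swarm.append([x for p in rng.sample(points, k) for x in p])

    vel = [[0.0] * size for _ in swarm]
    pbest = [list(p) for p in swarm]
    pbest_fit = [fitness(p) for p in swarm]
    best = min(range(len(swarm)), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = list(pbest[best]), pbest_fit[best]

    for _ in range(iters):
        for i, pos in enumerate(swarm):
            for d in range(size):
                r1, r2 = rng.random(), rng.random()
                # Standard PSO update: inertia + cognitive pull + social pull.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[d])
                             + c2 * r2 * (gbest[d] - pos[d]))
                pos[d] += vel[i][d]
            f = fitness(pos)
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = list(pos), f
                if f < gbest_fit:
                    gbest, gbest_fit = list(pos), f
    return unflatten(gbest), gbest_fit
```

Because the K-Means seed is itself a particle and the global best is only replaced by a strictly better position, this sketch never returns a result with a higher quantization error than its K-Means seed. In a distributed setting the fitness evaluation over all points is the natural step to parallelize, though how the paper partitions that work on Spark is not stated in the abstract.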
Pages: 801 - 808
Page count: 8