A parallel metaheuristic data clustering framework for cloud

Cited by: 19
Authors
Tsai, Chun-Wei [1]
Liu, Shi-Jui
Wang, Yi-Chung
Affiliations
[1] Natl Chung Hsing Univ, Dept Comp Sci & Engn, Taichung, Taiwan
Keywords
Metaheuristic algorithm; Internet of things; Data clustering problem; GENETIC ALGORITHM; INTERNET; THINGS; SPARK; SERVICES; FUSION
DOI
10.1016/j.jpdc.2017.10.020
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
High-performance data analytics for the Internet of Things (IoT) has been a promising research subject in recent years because traditional data mining algorithms may not be applicable to the big data of IoT. One of the main reasons is that the data to be analyzed may exceed the storage capacity of a single machine. Another critical problem in analyzing data from an IoT system is that the computation cost of data analysis tasks may be too high for a single computer system. This paper therefore presents an efficient data clustering framework for metaheuristic algorithms in a cloud computing environment, which explains how to divide the mining tasks of a mining algorithm among different nodes (i.e., the Map process) and then aggregate the mining results from these nodes (i.e., the Reduce process). We further attempted to use the proposed framework to implement data clustering algorithms (e.g., k-means, genetic k-means, and particle swarm optimization) on a standalone system and on Spark. The experimental results show that the performance of the proposed framework makes it useful for developing data clustering algorithms in a cloud computing environment. (C) 2017 Elsevier Inc. All rights reserved.
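The Map and Reduce split described in the abstract can be illustrated with a minimal sketch of one of the named algorithms, k-means. This is not the authors' framework and does not use Spark: it is plain Python that runs the per-partition work sequentially, and every function name here (`nearest`, `map_partition`, `merge_stats`, `kmeans_mapreduce`) is a hypothetical name chosen for illustration. Each partition stands in for the data held by one node.

```python
from functools import reduce


def nearest(point, centroids):
    """Return the index of the centroid closest to point (squared Euclidean)."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])),
    )


def map_partition(partition, centroids):
    """Map step: summarize one partition as {cluster: (coordinate sums, count)}."""
    stats = {}
    for point in partition:
        j = nearest(point, centroids)
        total, count = stats.get(j, ([0.0] * len(point), 0))
        stats[j] = ([t + p for t, p in zip(total, point)], count + 1)
    return stats


def merge_stats(a, b):
    """Reduce step: combine the partial summaries produced by two partitions."""
    merged = dict(a)
    for j, (total, count) in b.items():
        if j in merged:
            t0, c0 = merged[j]
            merged[j] = ([x + y for x, y in zip(t0, total)], c0 + count)
        else:
            merged[j] = (total, count)
    return merged


def kmeans_mapreduce(partitions, centroids, iterations=10):
    """Iterate Map (per partition) and Reduce (global merge) to update centroids."""
    for _ in range(iterations):
        # In a real cluster these calls would run on different nodes.
        partials = [map_partition(p, centroids) for p in partitions]
        merged = reduce(merge_stats, partials, {})
        # A cluster that received no points keeps its previous centroid.
        centroids = [
            [t / merged[j][1] for t in merged[j][0]] if j in merged else centroids[j]
            for j in range(len(centroids))
        ]
    return centroids


if __name__ == "__main__":
    parts = [[(0.0, 0.0), (10.0, 10.0)], [(0.0, 1.0), (10.0, 11.0)]]
    print(kmeans_mapreduce(parts, [[0.0, 0.0], [10.0, 10.0]]))
```

Only the per-cluster sums and counts leave a partition, so the amount of data aggregated in the Reduce step depends on the number of clusters rather than on the number of points, which is the property that makes this kind of split attractive when the data do not fit on one machine.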
Pages: 39-49
Page count: 11