A parallel metaheuristic data clustering framework for cloud

Cited by: 19
Authors
Tsai, Chun-Wei [1]
Liu, Shi-Jui
Wang, Yi-Chung
Affiliations
[1] Natl Chung Hsing Univ, Dept Comp Sci & Engn, Taichung, Taiwan
Keywords
Metaheuristic algorithm; Internet of things; Data clustering problem; GENETIC ALGORITHM; INTERNET; THINGS; SPARK; SERVICES; FUSION
DOI
10.1016/j.jpdc.2017.10.020
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
High-performance data analytics for the Internet of Things (IoT) has been a promising research subject in recent years because traditional data mining algorithms may not be applicable to IoT big data. One of the main reasons is that the data to be analyzed may exceed the storage capacity of a single machine. Another critical problem in analyzing data from an IoT system is that the computation cost of data analysis tasks may be too high for a single computer system. This paper therefore presents an efficient data clustering framework for metaheuristic algorithms in a cloud computing environment, which explains how to divide the mining tasks of a mining algorithm among different nodes (i.e., the Map process) and then aggregate the mining results from these nodes (i.e., the Reduce process). We further attempted to use the proposed framework to implement data clustering algorithms (e.g., k-means, genetic k-means, and particle swarm optimization) on a standalone system and on Spark. The experimental results show that the performance of the proposed framework makes it useful for developing data clustering algorithms in a cloud computing environment. (C) 2017 Elsevier Inc. All rights reserved.
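The Map and Reduce processes named in the abstract can be illustrated with a minimal sketch of one k-means iteration. This is not the paper's framework or its Spark implementation: it is plain Python running on one machine, and the function names (`map_partition`, `merge`, `kmeans_step`), the two-node data split, and the starting centroids are all assumptions made for illustration. Each hypothetical node summarizes its own partition as per-cluster sums and counts (Map), and the summaries are then combined to update the centroids (Reduce).

```python
from functools import reduce


def nearest(point, centroids):
    """Return the index of the centroid closest to point (squared Euclidean)."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])),
    )


def map_partition(partition, centroids):
    """Map step: summarize one node's partition as per-cluster sums and counts."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for point in partition:
        j = nearest(point, centroids)
        counts[j] += 1
        for d, value in enumerate(point):
            sums[j][d] += value
    return sums, counts


def merge(a, b):
    """Reduce step: combine the partial statistics of two nodes."""
    sums = [[x + y for x, y in zip(sa, sb)] for sa, sb in zip(a[0], b[0])]
    counts = [x + y for x, y in zip(a[1], b[1])]
    return sums, counts


def kmeans_step(partitions, centroids):
    """One k-means iteration expressed as map followed by reduce."""
    # On a real cluster each partition would be processed on its own node.
    partials = [map_partition(p, centroids) for p in partitions]
    sums, counts = reduce(merge, partials)
    return [
        [s / n for s in total] if n else list(old)  # empty cluster keeps its centroid
        for total, n, old in zip(sums, counts, centroids)
    ]


# Illustrative data only: two hypothetical nodes, each holding two 2-D points.
partitions = [
    [(0.0, 0.0), (0.0, 2.0)],
    [(10.0, 10.0), (10.0, 12.0)],
]
centroids = [(1.0, 1.0), (9.0, 9.0)]
for _ in range(5):
    centroids = kmeans_step(partitions, centroids)
print(centroids)
```

Only the per-cluster sums and counts cross node boundaries in this sketch, never the raw points, which is what lets the data set exceed the storage of any single machine.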
Pages: 39-49
Number of pages: 11