Big Data Clustering: A Review

Cited by: 0
Authors
Shirkhorshidi, Ali Seyed [1]
Aghabozorgi, Saeed [1]
Teh, Ying Wah [1]
Herawan, Tutut [1]
Affiliations
[1] Univ Malaya, Fac Comp Sci & Informat Technol, Dept Informat Syst, Kuala Lumpur 50603, Malaysia
Keywords
Big Data; Clustering; MapReduce; Parallel Clustering; WAY PARTITIONING SCHEME; EXCEPTION RULES; ALGORITHM; DBSCAN; FUZZY
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Clustering is an essential data mining tool for analyzing big data. Applying clustering techniques to big data is difficult because of the new challenges that big data raises. Because big data refers to terabytes and petabytes of data, and clustering algorithms come with high computational costs, the question is how to cope with this problem: how to deploy clustering techniques on big data and obtain results in a reasonable time. This study reviews the trend and progress of clustering algorithms in coping with big data challenges, from the very first proposed algorithms to today's novel solutions. The algorithms and the challenges they target are introduced and analyzed, and a possible future path toward more advanced algorithms is then outlined based on today's available technologies and frameworks.
Pages: 707-720
Number of pages: 14