Big Data Clustering: A Review

Cited by: 0
Authors
Shirkhorshidi, Ali Seyed [1]
Aghabozorgi, Saeed [1]
Teh, Ying Wah [1]
Herawan, Tutut [1]
Affiliations
[1] Univ Malaya, Fac Comp Sci & Informat Technol, Dept Informat Syst, Kuala Lumpur 50603, Malaysia
Keywords
Big Data; Clustering; MapReduce; Parallel Clustering; WAY PARTITIONING SCHEME; EXCEPTION RULES; ALGORITHM; DBSCAN; FUZZY;
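DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract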
Clustering is an essential data mining tool for analyzing big data. Applying clustering techniques to big data is difficult because of the new challenges that big data raises. Because big data refers to terabytes and petabytes of data, and clustering algorithms come with high computational costs, the question is how to cope with this problem and how to deploy clustering techniques on big data and obtain results in a reasonable time. This study reviews the trend and progress of clustering algorithms in coping with big data challenges, from the very first proposed algorithms to today's novel solutions. The algorithms and the challenges targeted in producing improved clustering algorithms are introduced and analyzed, and a possible future path toward more advanced algorithms is then outlined based on today's available technologies and frameworks.
Pages: 707-720
Page count: 14