Big Data Clustering: A Review

Cited by: 0
Authors
Shirkhorshidi, Ali Seyed [1]
Aghabozorgi, Saeed [1]
Teh, Ying Wah [1]
Herawan, Tutut [1]
Affiliation
[1] Univ Malaya, Fac Comp Sci & Informat Technol, Dept Informat Syst, Kuala Lumpur 50603, Malaysia
Keywords
Big Data; Clustering; MapReduce; Parallel Clustering; WAY PARTITIONING SCHEME; EXCEPTION RULES; ALGORITHM; DBSCAN; FUZZY
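DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract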
Clustering is an essential data mining tool for analyzing big data. Applying clustering techniques to big data is difficult because of the new challenges that big data raises. Because big data refers to terabytes and petabytes of data, and clustering algorithms carry high computational costs, the question is how to cope with this problem: how to deploy clustering techniques on big data and still obtain results in a reasonable time. This study reviews the trend and progress of clustering algorithms in coping with big data challenges, from the earliest proposed algorithms to today's novel solutions. The algorithms, and the challenges they target in producing improved clustering algorithms, are introduced and analyzed; the possible future path toward more advanced algorithms is then outlined based on currently available technologies and frameworks.
Pages: 707-720
Page count: 14
Related papers
50 items in total
  • [21] Clustering Algorithms for Spatial Big Data
    Schoier, Gabriella
    Gregorio, Caterina
    [J]. COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2017, PT IV, 2017, 10407 : 571 - 583
  • [22] Survey on clustering methods: Towards fuzzy clustering for big data
    Ben Ayed, Abdelkarim
    Ben Halima, Mohamed
    Alimi, Adel M.
    [J]. 2014 6TH INTERNATIONAL CONFERENCE OF SOFT COMPUTING AND PATTERN RECOGNITION (SOCPAR), 2014, : 331 - 336
  • [23] The Review of Big Data
    Shi, Chunhe
    Wu, Chengdong
    Han, Xiaowei
    Li, Zhen
    Xie, Yinghong
    [J]. PROCEEDINGS OF THE 6TH INTERNATIONAL CONFERENCE ON ELECTRONIC, MECHANICAL, INFORMATION AND MANAGEMENT SOCIETY (EMIM), 2016, 40 : 108 - 112
  • [24] Big Data: A Review
    Sagiroglu, Seref
    Sinanc, Duygu
    [J]. PROCEEDINGS OF THE 2013 INTERNATIONAL CONFERENCE ON COLLABORATION TECHNOLOGIES AND SYSTEMS (CTS), 2013, : 42 - 47
  • [25] Data clustering: A review
    Jain, AK
    Murty, MN
    Flynn, PJ
    [J]. ACM COMPUTING SURVEYS, 1999, 31 (03) : 264 - 323
  • [26] Continuous Clustering in Big Data Learning Analytics
    Govindarajan, Kannan
    Somasundaram, Thamarai Selvi
    Kumar, Vivekanandan S.
    Kinshuk
    [J]. 2013 IEEE FIFTH INTERNATIONAL CONFERENCE ON TECHNOLOGY FOR EDUCATION (T4E 2013), 2013, : 61 - 64
  • [27] Big-Data Clustering with Genetic Algorithm
    Mortezanezhad, Afsaneh
    Daneshifar, Ebrahim
    [J]. 2019 IEEE 5TH CONFERENCE ON KNOWLEDGE BASED ENGINEERING AND INNOVATION (KBEI 2019), 2019, : 702 - 706
  • [28] A Clustering Based Anonymization Model for Big Data
    Canbay, Yavuz
    Kalyoncu, Aydincan
    Ercimen, Mucahid
    Dogan, Adem
    Sagiroglu, Seref
    [J]. 2019 4TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND ENGINEERING (UBMK), 2019, : 720 - 725
  • [29] Fast and effective Big Data exploration by clustering
    Ianni, Michele
    Masciari, Elio
    Mazzeo, Giuseppe M.
    Mezzanzanica, Mario
    Zaniolo, Carlo
    [J]. FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2020, 102 : 84 - 94
  • [30] Research on incremental clustering algorithm for big data
    Yang X.
    [J]. Applied Mathematics and Nonlinear Sciences, 2023, 8 (02) : 169 - 180