Big data clustering with varied density based on MapReduce

Cited by: 31
Authors
Heidari, Safanaz [1]
Alborzi, Mahmood [1]
Radfar, Reza [1]
Afsharkazemi, Mohammad Ali [2]
Ghatari, Ali Rajabzadeh [3]
Affiliations
[1] Islamic Azad Univ, Dept Informat Technol Management, Sci & Res Branch, Tehran, Iran
[2] Islamic Azad Univ, Dept Ind Management, Cent Tehran Branch, Tehran, Iran
[3] Tarbiat Modares Univ, Dept Management, Tehran, Iran
Keywords
Map-Reduce; Density-based clustering; Big data; ALGORITHM;
DOI
10.1186/s40537-019-0236-x
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
DBSCAN is a prevalent density-based clustering algorithm whose most important feature is its ability to detect clusters of arbitrary and varied shapes, as well as noise data. Nevertheless, the algorithm faces a number of challenges, including failure to find clusters of varied densities. Meanwhile, with the rapid development of the information age, so much data is produced every day that a single machine cannot process it; hence, new technologies are required to store and extract information from this volume of data. A volume of data that is beyond the capabilities of existing software is called big data. In this paper, we introduce a new algorithm for clustering big data with varied density using a Hadoop platform running MapReduce. The main idea of this research is to use local density to find each point's density. This strategy avoids connecting clusters with varying densities. The proposed algorithm is implemented and compared with other algorithms using the MapReduce paradigm, and it shows the best varied-density clustering capability and scalability among them.
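The abstract does not say how local density is computed, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes a k-nearest-neighbour estimate (the inverse of the mean distance to the k closest points); the function name `knn_local_density` and the toy data are hypothetical. It runs on a single machine and leaves out the MapReduce partitioning entirely.

```python
import math

def knn_local_density(points, k=2):
    """Estimate a per-point density as 1 / (mean distance to the k nearest
    neighbours). This formula is an assumption made for illustration."""
    if k < 1 or k > len(points) - 1:
        raise ValueError("k must be between 1 and len(points) - 1")
    densities = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        mean_knn = sum(dists[:k]) / k
        densities.append(1.0 / mean_knn if mean_knn > 0 else math.inf)
    return densities

# Toy data: one tight cluster (spacing 0.1) and one sparse cluster (spacing 2.0).
dense = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
sparse = [(10.0, 10.0), (12.0, 10.0), (10.0, 12.0), (12.0, 12.0)]
densities = knn_local_density(dense + sparse, k=2)
```

With a single global radius, a value small enough to suit the tight cluster would mark every point of the sparse cluster as noise, while a value large enough for the sparse cluster could merge nearby dense clusters. A per-point estimate like the one above gives each region its own scale, which is the kind of information a varied-density method can use.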
Pages: 16
Source
Journal of Big Data, 6