Big data clustering with varied density based on MapReduce

Cited by: 31

Authors:

Heidari, Safanaz ^{[1]}

Alborzi, Mahmood ^{[1]}

Radfar, Reza ^{[1]}

Afsharkazemi, Mohammad Ali ^{[2]}

Ghatari, Ali Rajabzadeh ^{[3]}

Affiliations:

[1] Islamic Azad Univ, Dept Informat Technol Management, Sci & Res Branch, Tehran, Iran

[2] Islamic Azad Univ, Dept Ind Management, Cent Tehran Branch, Tehran, Iran

[3] Tarbiat Modares Univ, Dept Management, Tehran, Iran

Source:

JOURNAL OF BIG DATA | 2019 / Vol. 6 / Issue 01

Keywords:

Map-Reduce; Density-based clustering; Big data; ALGORITHM;

DOI:

10.1186/s40537-019-0236-x

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202 ;

Abstract:

DBSCAN is a widely used density-based clustering algorithm whose most important feature is its ability to detect varied clusters of arbitrary shape and to identify noise data. Nevertheless, the algorithm faces a number of challenges, including its failure to find clusters of varied densities. Meanwhile, with the rapid development of the information age, so much data is produced every day that a single machine cannot process it; new technologies are therefore required to store this data and extract information from it. Data whose volume is beyond the capabilities of existing software is called big data. In this paper, we introduce a new algorithm for clustering big data with varied density using a Hadoop platform running MapReduce. The main idea of this research is to use local density to determine each point's density. This strategy avoids connecting clusters that have different densities. The proposed algorithm is implemented and compared with other algorithms that use the MapReduce paradigm, and it shows the best varied-density clustering capability and scalability among them.
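
The abstract does not say how local density is computed, so the following is only a minimal single-machine sketch of one common definition: the inverse of the mean distance from a point to its k nearest neighbours. The function name `knn_local_density`, the parameter `k`, and the toy data are assumptions made for illustration; they are not taken from the paper, and the sketch omits the MapReduce partitioning entirely.

```python
import math


def knn_local_density(points, k=3):
    """Estimate a local density for every point.

    Assumed definition (not from the paper): density is the inverse of the
    mean Euclidean distance to the k nearest neighbours of the point.
    """
    densities = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        mean_dist = sum(dists[:k]) / k
        # Coincident points would give a zero mean distance.
        densities.append(1.0 / mean_dist if mean_dist > 0 else math.inf)
    return densities


# Toy data: a tight group near the origin and a sparse group far away.
dense = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
sparse = [(10.0, 10.0), (12.0, 10.0), (10.0, 12.0), (12.0, 12.0)]
densities = knn_local_density(dense + sparse, k=3)
```

Because each point is scored against its own neighbourhood, points in the tight group receive much higher values than points in the sparse group, so a per-point density of this kind can distinguish the two groups where a single global DBSCAN radius might merge or miss them.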


Pages: 16
