Clustering on Big Data Using Hadoop MapReduce

Cited by: 3
Authors
Akthar, Nadeem [1]
Ahamad, Mohd Vasim [1]
Khan, Shahbaz [1]
Affiliations
[1] Aligarh Muslim Univ, ZHCET, Dept Comp Engn, Aligarh 202002, Uttar Pradesh, India
Keywords
Clustering; Big Data; K-Means Clustering; Hadoop; MapReduce; Data Mining
DOI
10.1109/CICN.2015.161
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the phenomenal increase in digital data, it is inefficient to run traditional clustering algorithms on separate servers. To deal with this problem, researchers are migrating to distributed environments to implement traditional clustering algorithms, in particular K-Means clustering. Traditional K-Means clustering suffers from instability caused by random initial centers: if the algorithm is executed more than once on the same data set with random initial centers, it yields different cluster results each time, making the results unstable. Here, we propose a modified K-Means clustering algorithm that takes optimized initial centers based on data dimensional density. This approach addresses the problem of randomly chosen initial centers and provides stable cluster results.
Pages: 789-795
Number of pages: 7
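
The abstract contrasts random initial centers, which can give different clusters from run to run, with initial centers chosen from data density, which give a repeatable result. The sketch below illustrates that contrast on a single machine. It is not the paper's algorithm: the abstract does not specify how "data dimensional density" is computed, so `density_init` and its `radius` parameter are hypothetical stand-ins (count neighbours within a radius, then pick dense points that are far apart), and the Hadoop MapReduce distribution is left out entirely.

```python
import math
import random


def assign(points, centers):
    """Return, for each point, the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points]


def kmeans(points, centers, max_iter=100):
    """Plain Lloyd iterations starting from the given initial centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # New center is the coordinate-wise mean of the members.
                new_centers.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(old)  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)


def random_init(points, k, seed):
    """Random initial centers; the result depends on the seed."""
    return random.Random(seed).sample(points, k)


def density_init(points, k, radius):
    """Hypothetical density heuristic (an assumption, not the paper's method).

    Density of a point = number of points within `radius` of it. Points are
    visited from densest to sparsest (ties broken by index) and accepted only
    if farther than `radius` from every center chosen so far. May return
    fewer than k centers if the data has too few well-separated dense points.
    """
    density = [sum(1 for q in points if math.dist(p, q) <= radius)
               for p in points]
    order = sorted(range(len(points)), key=lambda i: (-density[i], i))
    chosen = []
    for i in order:
        if all(math.dist(points[i], c) > radius for c in chosen):
            chosen.append(points[i])
        if len(chosen) == k:
            break
    return chosen


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11),
            (20, 0), (20, 1), (21, 0), (21, 1)]
    for seed in (1, 2, 3):
        print("random seed", seed, kmeans(data, random_init(data, 3, seed))[0])
    print("density init", kmeans(data, density_init(data, 3, 2.0))[0])
```

Because `density_init` uses no randomness, repeated runs on the same data start from the same centers and therefore end in the same clusters, whereas `random_init` can start from different centers for different seeds. The radius-based density count is quadratic in the number of points, so a distributed version would need a cheaper density estimate.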