Kernelized Spectral Clustering based Conditional MapReduce function with big data

Cited by: 1
Authors
Maheswari K. [1]
Ramakrishnan M. [2]
Affiliations
[1] Department of Computer Science, Bharathiar University, Coimbatore
[2] School of Information Technology, Madurai Kamaraj University, Madurai
Keywords
Big data analytics; clustering; Conditional Maximum Entropy MapReduce; dimensionality reduction; irrelevant data; Kernelized Spectral Clustering
DOI
10.1080/1206212X.2019.1587892
Abstract
Clustering is a key data mining technique for big data analysis, in which large volumes of data are grouped. One aim of clustering is to reduce dimensionality when accessing large volumes of data. Several data mining techniques have been developed for clustering. However, the clustering problem has grown rapidly in recent years, because existing clustering algorithms fail to minimize clustering time and most techniques require a large amount of memory to perform the clustering task. To improve clustering accuracy and reduce dimensionality, a Kernelized Spectral Clustering based Conditional Maximum Entropy MapReduce (KSC-CMEMR) technique is introduced. A number of data points are collected from a big dataset. The KSC-CMEMR technique partitions the data into clusters using a Kernelized Spectral Clustering process, based on the spectrum of the similarity matrix, and performs dimensionality reduction. Because it is driven by similarity, Kernelized Spectral Clustering is carried out with higher clustering accuracy. The Conditional Maximum Entropy MapReduce model then eliminates the irrelevant data present in each cluster: it predicts the maximum probability that a data point is a member of the cluster and removes the irrelevant data from the cluster. This helps reduce the false positive rate and the space complexity. The experimental evaluation uses clustering accuracy, clustering time, false positive rate, and space complexity as parameters, measured with respect to the number of data points. The experimental results show that the proposed KSC-CMEMR technique obtains high clustering accuracy with minimum time and space complexity. © 2019 Informa UK Limited, trading as Taylor & Francis Group.
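The two stages named in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption rather than the authors' implementation: an RBF kernel as the similarity, a normalized spectral embedding in the style of Ng, Jordan and Weiss, a small deterministic k-means, a softmax over distances to cluster centers as a stand-in for the conditional maximum entropy membership probability, and an in-process imitation of the map and reduce steps. All function names and the parameters `gamma`, `beta`, and `threshold` are hypothetical.

```python
import numpy as np

def rbf_affinity(X, gamma=1.0):
    """Similarity matrix from an RBF kernel (assumed kernel choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-gamma * d2)

def spectral_embedding(W, k):
    """Map each point to k dimensions using the spectrum of the
    symmetrically normalized similarity matrix D^-1/2 W D^-1/2."""
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    S = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(S)            # eigenvalues in ascending order
    U = vecs[:, -k:]                       # k leading eigenvectors
    return U / np.linalg.norm(U, axis=1, keepdims=True)

def kmeans(U, k, iters=50):
    """Plain Lloyd iterations with deterministic farthest-point seeding."""
    centers = [U[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(iters):
        d2 = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new = np.array([U[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers

def membership_probs(U, centers, beta=10.0):
    """Softmax over negative squared distances: the maximum entropy
    distribution under an expected-distance constraint (assumed model)."""
    d2 = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    logits = -beta * d2
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def map_point(i, probs_i):
    """Map step: emit (most probable cluster, (point index, probability))."""
    c = int(np.argmax(probs_i))
    return c, (i, float(probs_i[c]))

def reduce_cluster(pairs, threshold):
    """Reduce step: keep only members whose probability meets the threshold."""
    return [i for i, p in pairs if p >= threshold]

def cluster_and_filter(X, k, gamma=1.0, beta=10.0, threshold=0.9):
    U = spectral_embedding(rbf_affinity(X, gamma), k)
    labels, centers = kmeans(U, k)
    P = membership_probs(U, centers, beta)
    grouped = {}
    for i in range(len(X)):                # local stand-in for the shuffle
        key, value = map_point(i, P[i])
        grouped.setdefault(key, []).append(value)
    kept = {c: reduce_cluster(pairs, threshold) for c, pairs in grouped.items()}
    return labels, kept
```

Calling `cluster_and_filter(X, k=2)` on an `(n, d)` array returns the cluster label of every point and, per cluster, the indices that survive the probability filter. The embedding has `k` columns regardless of `d`, which is one plausible reading of the dimensionality reduction the abstract mentions. The dense `n × n` similarity matrix limits this sketch to small inputs; it does not address the memory concerns the paper raises.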
Pages: 601-611
Page count: 10