Kernelized Spectral Clustering based Conditional MapReduce function with big data

Cited by: 1
Authors
Maheswari K. [1]
Ramakrishnan M. [2]
Affiliations
[1] Department of Computer Science, Bharathiar University, Coimbatore
[2] School of Information Technology, Madurai Kamaraj University, Madurai
Keywords
Big data analytics; clustering; Conditional Maximum Entropy MapReduce; dimensionality reduction; irrelevant data; Kernelized Spectral Clustering
DOI
10.1080/1206212X.2019.1587892
Abstract
Clustering is a key data mining technique for big data analysis, in which large volumes of data are grouped. One purpose of clustering is to reduce dimensionality when accessing large volumes of data. Many data mining techniques have been developed for clustering, but the problem has grown harder in recent years: existing clustering algorithms fail to minimize clustering time, and most require a large amount of memory to perform the clustering task. To improve clustering accuracy and reduce dimensionality, a Kernelized Spectral Clustering based Conditional Maximum Entropy MapReduce (KSC-CMEMR) technique is introduced. Data are first collected from a big dataset. The KSC-CMEMR technique then partitions the data into clusters using a Kernelized Spectral Clustering process, which operates on the spectrum of the similarity matrix and also performs dimensionality reduction. Clustering on the basis of this similarity yields higher clustering accuracy. Next, the Conditional Maximum Entropy MapReduce model eliminates the irrelevant data present in each cluster: it predicts the maximum probability that a data point is a member of a cluster and removes the irrelevant data from that cluster. This helps reduce the false positive rate and the space complexity. The experimental evaluation uses clustering accuracy, clustering time, false positive rate, and space complexity as metrics, measured against the number of data points. The experimental results show that the proposed KSC-CMEMR technique achieves high clustering accuracy with minimal time and space complexity. © 2019 Informa UK Limited, trading as Taylor & Francis Group.
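The two stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the kernel, the form of the Laplacian, or the entropy model, so everything below is an assumption. The sketch assumes a Gaussian (RBF) kernel with a hypothetical bandwidth `gamma`, a symmetrically normalized similarity matrix, k-means on the spectral embedding, and a softmax over negative squared distances as the membership probability (the maximum-entropy form under an expected-distortion constraint). The names `beta`, `threshold`, and `chunk` are hypothetical, and the map and reduce steps are simulated in-process by scoring chunks independently and concatenating the results.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    # Pairwise Gaussian similarities; gamma is an assumed bandwidth parameter.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    return np.exp(-gamma * d2)

def spectral_embed(K, k):
    # Top-k eigenvectors of D^{-1/2} K D^{-1/2}, rows rescaled to unit length.
    d_inv_sqrt = 1.0 / np.sqrt(K.sum(axis=1))
    M = K * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(M)  # eigenvalues come back in ascending order
    U = vecs[:, -k:]
    return U / np.linalg.norm(U, axis=1, keepdims=True)

def kmeans(U, k, iters=50):
    # Deterministic farthest-point initialisation, then Lloyd iterations.
    centers = [U[0]]
    for _ in range(1, k):
        d = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        dist2 = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = U[labels == j].mean(axis=0)
    return labels, centers

def membership_probs(U, centers, beta=10.0):
    # Softmax over negative squared distances (assumed membership model).
    logits = -beta * ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    P = np.exp(logits)
    return P / P.sum(axis=1, keepdims=True)

def filter_irrelevant(U, centers, threshold=0.9, chunk=1000):
    # "Map": score each chunk independently. "Reduce": concatenate the masks.
    masks = [membership_probs(U[i:i + chunk], centers).max(axis=1) >= threshold
             for i in range(0, len(U), chunk)]
    return np.concatenate(masks)

# Toy data: two well-separated 2-D blobs (illustrative only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
U = spectral_embed(rbf_kernel(X), k=2)
labels, centers = kmeans(U, k=2)
keep = filter_irrelevant(U, centers)  # True where a point is kept in its cluster
print(labels, int(keep.sum()))
```

In this sketch a point is treated as irrelevant when its highest membership probability falls below `threshold`. A real MapReduce deployment would distribute the chunks across workers, and would also need an approximation of the kernel matrix, since the dense n-by-n matrix built here does not scale to big data.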
Pages: 601-611
Page count: 10