An efficient sampling-based visualization technique for big data clustering with crisp partitions

Cited by: 4
Authors
Rajendra Prasad, K. [1]
Mohammed, Moulana [2]
Narasimha Prasad, L. V. [3]
Anguraj, Dinesh Kumar [2]
Affiliations
[1] Rajeev Gandhi Mem Coll Engn & Technol, Dept CSE, Nandyal, Andhra Pradesh, India
[2] Koneru Lakshmaiah Educ Fdn, Comp Sci & Engn, Guntur, Andhra Pradesh, India
[3] Inst Aeronaut Engn, Dept CSE, Hyderabad, Telangana, India
Keywords
Cluster tendency; Visualization techniques; Data clustering; Crisp partitions; Sampling;
DOI
10.1007/s10619-021-07324-3
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Assessing cluster tendency is an emerging need in big data cluster analysis. Cluster tendency refers to evaluating a dataset in terms of the number of clusters it contains. Many visualization techniques have been developed to detect cluster tendency. Existing techniques, including Visual Assessment of Tendency (VAT), spectral-based VAT (SpecVAT), and improved VAT (iVAT), have been considerably successful at assessing cluster tendency for small datasets. bigVAT is another method, developed more recently, for estimating the cluster tendency of big data. It is well suited to deriving the clustering tendency of big data in visual form. However, exploring the data clusters themselves remains intractable for large volumes of data objects. The proposed work addresses this clustering problem of bigVAT by deriving sampling-based crisp partitions, which are used to predict the cluster labels of the data objects accurately. The work is evaluated on big synthetic and big real-life datasets to demonstrate its performance efficiency.
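The general idea of combining VAT-style reordering with sampling can be illustrated with a minimal sketch. It is not the authors' algorithm, which the abstract does not spell out. It assumes Euclidean dissimilarities, uniform random sampling, a known number of clusters k, and a nearest-sample rule for extending labels to the unsampled objects. The function names vat_order and crisp_labels_from_sample are hypothetical.

```python
import numpy as np


def vat_order(D):
    """Return a VAT-style ordering of a square dissimilarity matrix D and,
    for each ordered object, the dissimilarity that attached it."""
    n = D.shape[0]
    # Start from one endpoint of the largest dissimilarity.
    first = int(np.unravel_index(np.argmax(D), D.shape)[0])
    order = [first]
    weights = [0.0]
    remaining = np.ones(n, dtype=bool)
    remaining[first] = False
    # dist_to_tree[j]: smallest dissimilarity from j to any ordered object.
    dist_to_tree = D[first].copy()
    for _ in range(n - 1):
        cand = np.where(remaining)[0]
        j = int(cand[np.argmin(dist_to_tree[cand])])
        order.append(j)
        weights.append(float(dist_to_tree[j]))
        remaining[j] = False
        dist_to_tree = np.minimum(dist_to_tree, D[j])
    return np.array(order), np.array(weights)


def crisp_labels_from_sample(X, k, sample_size, seed=0):
    """Assumed pipeline: sample, order the sample, split the ordering at the
    k - 1 largest attachment weights, then label every object in X by its
    nearest sampled object."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=sample_size, replace=False)
    S = X[idx]
    D = np.linalg.norm(S[:, None, :] - S[None, :, :], axis=2)
    order, weights = vat_order(D)
    if k > 1:
        # Positions in the ordering where a new block (cluster) starts.
        cuts = np.sort(np.argsort(weights[1:])[-(k - 1):] + 1)
    else:
        cuts = np.array([], dtype=int)
    block_of_position = np.searchsorted(cuts, np.arange(sample_size), side="right")
    sample_labels = np.empty(sample_size, dtype=int)
    sample_labels[order] = block_of_position
    # Nearest-sample extension; a real big-data run would do this in chunks.
    d_all = np.linalg.norm(X[:, None, :] - S[None, :, :], axis=2)
    return sample_labels[np.argmin(d_all, axis=1)]
```

Only the sample's dissimilarity matrix is ever built in full, which is what keeps the ordering step tractable; the cost of labelling the remaining objects grows linearly with their number.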
Pages: 813-832
Number of pages: 20
Related papers
50 in total
  • [1] An efficient sampling-based visualization technique for big data clustering with crisp partitions
    K. Rajendra Prasad
    Moulana Mohammed
    L. V. Narasimha Prasad
    Dinesh Kumar Anguraj
    Distributed and Parallel Databases, 2021, 39 : 813 - 832
  • [2] Sampling-Based Consensus Fuzzy Clustering on Big Data
    Zoghlami, Mohamed Ali
    Sassi Hidri, Minyar
    Ben Ayed, Rahma
    2016 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2016, : 1501 - 1508
  • [3] Centrality Clustering-Based Sampling for Big Data Visualization
    Tam Thanh Nguyen
    Song, Insu
    2016 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2016, : 1911 - 1917
  • [4] Sampling-based visual assessment computing techniques for an efficient social data clustering
    M. Suleman Basha
    S. K. Mouleeswaran
    K. Rajendra Prasad
    The Journal of Supercomputing, 2021, 77 : 8013 - 8037
  • [5] Sampling-based visual assessment computing techniques for an efficient social data clustering
    Basha, M. Suleman
    Mouleeswaran, S. K.
    Prasad, K. Rajendra
    JOURNAL OF SUPERCOMPUTING, 2021, 77 (08): : 8013 - 8037
  • [6] A sampling-based approach for efficient clustering in large datasets
    Exarchakis, Georgios
    Oubari, Omar
    Lenz, Gregor
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 12393 - 12402
  • [7] Sampling-based approximate skyline calculation on big data
    Xiao, Xingxing
    Li, Jianzhong
    DISCRETE MATHEMATICS ALGORITHMS AND APPLICATIONS, 2022, 14 (07)
  • [8] Efficient Sampling-based ADMM for Distributed Data
    Wang, Jun-Kun
    Lin, Shou-De
    PROCEEDINGS OF 3RD IEEE/ACM INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS, (DSAA 2016), 2016, : 321 - 330
  • [9] An Efficient Clustering Technique for Big Data Mining
    Banait, Satish S.
    Sane, S. S.
    Talekar, Sopan A.
    INTERNATIONAL JOURNAL OF NEXT-GENERATION COMPUTING, 2022, 13 (03): : 702 - 717
  • [10] A novel sampling-based visual topic models with computational intelligence for big social health data clustering
    Narasimhulu, K.
    Abarna, K. T. Meena
    Kumar, B. Siva
    Suresh, T.
    JOURNAL OF SUPERCOMPUTING, 2022, 78 (07): : 9619 - 9641