An efficient sampling-based visualization technique for big data clustering with crisp partitions

Cited by: 4
Authors
Rajendra Prasad, K. [1]
Mohammed, Moulana [2]
Narasimha Prasad, L. V. [3]
Anguraj, Dinesh Kumar [2]
Affiliations
[1] Rajeev Gandhi Mem Coll Engn & Technol, Dept CSE, Nandyal, Andhra Pradesh, India
[2] Koneru Lakshmaiah Educ Fdn, Comp Sci & Engn, Guntur, Andhra Pradesh, India
[3] Inst Aeronaut Engn, Dept CSE, Hyderabad, Telangana, India
Keywords
Cluster tendency; Visualization techniques; Data clustering; Crisp partitions; Sampling
DOI
10.1007/s10619-021-07324-3
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Assessing cluster tendency is an emerging need in big data cluster analysis. Cluster tendency refers to evaluating data in terms of the number of clusters it contains. Many visualization techniques have been developed to detect cluster tendency. Existing techniques such as Visual Assessment of Tendency (VAT), spectral-based VAT (SpecVAT), and improved VAT (iVAT) have been considerably successful at assessing cluster tendency for small datasets. bigVAT is another method, developed more recently, for estimating the cluster tendency of big data. It is well suited to presenting the clustering tendency of big data in visual form. However, it becomes intractable when the task is to explore the clusters themselves across large volumes of data objects. The proposed work addresses this clustering problem of bigVAT by deriving sampling-based crisp partitions, which accurately predict the cluster labels of data objects. Big synthetic and big real-life datasets are used to demonstrate the performance efficiency of the proposed work.
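The general idea named in the abstract (reorder a sample VAT-style, derive a crisp partition of that sample, then predict labels for the remaining objects) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the sampling scheme or the partitioning rule, so uniform random sampling, cutting the k-1 heaviest edges found during the VAT ordering, and nearest-sample label extension are choices made here purely for illustration. All function names are hypothetical.

```python
import numpy as np

def pairwise_dist(A, B):
    # Euclidean distances between rows of A and rows of B.
    # Builds a dense (len(A), len(B), d) array; real big data would need chunking.
    diff = A[:, None, :] - B[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def vat_order(R):
    # VAT reordering of a square dissimilarity matrix R (Prim-like traversal).
    # Returns the visiting order and, for each position, the weight of the
    # edge that attached that point to the points already visited.
    n = R.shape[0]
    start = int(np.unravel_index(np.argmax(R), R.shape)[0])
    order = [start]
    visited = np.zeros(n, dtype=bool)
    visited[start] = True
    dist = R[start].astype(float).copy()   # distance of each point to the visited set
    weights = [0.0]                        # the first point has no attaching edge
    for _ in range(n - 1):
        masked = np.where(visited, np.inf, dist)
        j = int(np.argmin(masked))
        weights.append(float(masked[j]))
        order.append(j)
        visited[j] = True
        dist = np.minimum(dist, R[j])
    return np.array(order), np.array(weights)

def sample_crisp_partition(X, k, sample_size, seed=0):
    # ASSUMPTION: uniform random sampling; the paper may use a different scheme.
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=sample_size, replace=False)
    S = X[idx]
    order, weights = vat_order(pairwise_dist(S, S))
    # ASSUMPTION: cut the ordering at the k-1 heaviest attaching edges,
    # so each cluster is a contiguous block of the VAT ordering.
    if k > 1:
        cuts = np.sort(np.argsort(weights)[-(k - 1):])
    else:
        cuts = np.array([], dtype=int)
    block = np.searchsorted(cuts, np.arange(sample_size), side="right")
    sample_labels = np.empty(sample_size, dtype=int)
    sample_labels[order] = block
    # ASSUMPTION: every object takes the label of its nearest sampled object.
    nearest = np.argmin(pairwise_dist(X, S), axis=1)
    return sample_labels[nearest], idx, order

# Usage on two well-separated synthetic blobs (illustrative data only).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, (200, 2)), rng.normal(5, 0.1, (200, 2))])
labels, idx, order = sample_crisp_partition(X, k=2, sample_size=40)
```

Only the sample's dissimilarity matrix is reordered, so the quadratic cost applies to the sample size rather than to the full dataset; the reordered matrix `R[np.ix_(order, order)]` of the sample is what would be displayed as the VAT image.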
Pages: 813-832
Page count: 20