An efficient sampling-based visualization technique for big data clustering with crisp partitions

Cited by: 4
Authors
Rajendra Prasad, K. [1]
Mohammed, Moulana [2]
Narasimha Prasad, L. V. [3]
Anguraj, Dinesh Kumar [2]
Affiliations
[1] Rajeev Gandhi Mem Coll Engn & Technol, Dept CSE, Nandyal, Andhra Pradesh, India
[2] Koneru Lakshmaiah Educ Fdn, Comp Sci & Engn, Guntur, Andhra Pradesh, India
[3] Inst Aeronaut Engn, Dept CSE, Hyderabad, Telangana, India
Keywords
Cluster tendency; Visualization techniques; Data clustering; Crisp partitions; Sampling
DOI
10.1007/s10619-021-07324-3
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Assessing cluster tendency is an emerging need in big data cluster analysis. Cluster tendency refers to evaluating a dataset in terms of the number of clusters it contains. Many visualization techniques have been developed to detect cluster tendency. Existing techniques such as Visual Assessment of Tendency (VAT), spectral-based VAT (SpecVAT), and improved VAT (iVAT) have been considerably successful at assessing cluster tendency for small datasets. bigVAT is a more recently developed method for estimating the cluster tendency of big data. It is well suited to deriving the clustering tendency of big data in visual form. However, exploring the data clusters themselves remains intractable for large volumes of data objects. The proposed work addresses this clustering problem of bigVAT by deriving sampling-based crisp partitions, which are used to predict the cluster labels of data objects accurately. Experiments on big synthetic and big real-life datasets demonstrate the performance efficiency of the proposed work.
Pages: 813-832
Number of pages: 20