Fast and Memory-Efficient Approximate Minimum Spanning Tree Generation for Large Datasets

Cited by: 0
Authors
Almansoori, Mahmood K. M. [1]
Meszaros, Andras [1, 2]
Telek, Miklos [1, 2]
Affiliations
[1] Budapest Univ Technol & Econ, Dept Networked Syst & Serv, Budapest, Hungary
[2] ELKH BME Informat Syst Res Grp, Budapest, Hungary
Keywords
Large datasets; High-dimensional data; Approximate MST; Memory-efficient; NEAREST-NEIGHBORS; DECOMPOSITION; GRAPH
DOI
10.1007/s13369-024-08974-y
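Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09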
Abstract
Conventional minimum spanning tree (MST) algorithms typically start by creating a distance matrix of the $n(n-1)/2$ pairs of data points, leading to a time complexity of $O(n^2)$. This initial step poses a computational bottleneck. To overcome this limitation, we present a novel method that constructs an initial random k-neighbor graph and optimizes this graph by employing a crawling technique to efficiently approximate the k-nearest-neighbors (kNN) graph. This crawling approach allows us to approximate the closest neighbors of each node. Subsequently, the approximate kNN graph is utilized to build an initial approximate MST and iteratively refine it by the same crawling process. Using this approach, an approximate MST can be obtained for a data set of size n with empirical cost around $O(n^{1.07})$ and a minimal $O(n)$ memory consumption. In contrast to spatial tree-based approaches, the presented method also scales well to high-dimensional data. We have shown that the proposed approach achieves such a level of performance with only a marginal accuracy reduction between 0.5% and 6%. These qualities make it an excellent candidate for approximate MST calculation for high-dimensional, large data sets.
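The pipeline described in the abstract (random k-neighbor graph, neighbor refinement, MST built from the sparse graph) can be illustrated with a rough sketch. This is not the authors' implementation: the abstract does not specify how the crawling step works, so the sketch assumes a simple one-hop "neighbors of neighbors" update, similar in spirit to NN-descent. The function names `approx_knn` and `approx_mst`, the parameters `k`, `iters` and `seed`, and the brute-force fallback for disconnected components are all hypothetical choices made for illustration. The paper's own MST refinement by crawling is not reproduced here.

```python
import math
import random


def approx_knn(points, k, iters=5, seed=0):
    """Approximate kNN lists: random start, then neighbor-of-neighbor updates."""
    rng = random.Random(seed)
    n = len(points)
    # Initial random k-neighbor graph: k distinct neighbors per node, never i itself.
    nbrs = [[j if j < i else j + 1 for j in rng.sample(range(n - 1), k)]
            for i in range(n)]
    for _ in range(iters):
        changed = False
        for i in range(n):
            # Assumed "crawl": look one hop further through current neighbors.
            cand = set(nbrs[i])
            for j in nbrs[i]:
                cand.update(nbrs[j])
            cand.discard(i)
            best = sorted(cand, key=lambda j: math.dist(points[i], points[j]))[:k]
            if set(best) != set(nbrs[i]):
                changed = True
            nbrs[i] = best
        if not changed:
            break
    return nbrs


def approx_mst(points, k=5, iters=5, seed=0):
    """Kruskal's algorithm restricted to the edges of the approximate kNN graph."""
    n = len(points)
    nbrs = approx_knn(points, k, iters, seed)
    # Only about n*k candidate edges are stored, not all n(n-1)/2 pairs.
    edges = sorted({(math.dist(points[i], points[j]), min(i, j), max(i, j))
                    for i in range(n) for j in nbrs[i]})
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j, w))
    # Simplifying assumption: if the kNN graph was disconnected, join the
    # remaining components by brute force (stands in for the paper's refinement).
    while len(tree) < n - 1:
        root = find(0)
        inside = [i for i in range(n) if find(i) == root]
        outside = [j for j in range(n) if find(j) != root]
        w, i, j = min((math.dist(points[i], points[j]), i, j)
                      for i in inside for j in outside)
        parent[find(j)] = root
        tree.append((i, j, w))
    return tree
```

Calling `approx_mst(points, k=5)` on a list of coordinate tuples returns `n - 1` edges as `(i, j, weight)` triples. The resulting tree is always spanning, but its total weight is only an upper bound on the exact MST weight, and the sketch makes no attempt to match the runtime or accuracy figures quoted in the abstract.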
Pages: 14
Related papers
50 in total
  • [1] Fast and memory-efficient minimum spanning tree on the
    Rostrup, Scott
    Srivastava, Shweta
    Singhal, Kishore
    [J]. INTERNATIONAL JOURNAL OF COMPUTATIONAL SCIENCE AND ENGINEERING, 2013, 8 (01) : 21 - 33
  • [2] Fast approximate minimum spanning tree based clustering algorithm
    Jothi, R.
    Mohanty, Sraban Kumar
    Ojha, Aparajita
    [J]. NEUROCOMPUTING, 2018, 272 : 542 - 557
  • [3] MINIMUM SPANNING TREE GENERATION WITH CONTENT-ADDRESSABLE MEMORY
    PARK, TG
    OLDFIELD, JV
    [J]. ELECTRONICS LETTERS, 1993, 29 (11) : 1037 - 1039
  • [4] Analysis of a memory-efficient self-stabilizing BFS spanning tree construction
    Datta, Ajoy K.
    Devismes, Stephane
    Johnen, Colette
    Larmore, Lawrence L.
    [J]. THEORETICAL COMPUTER SCIENCE, 2023, 955
  • [5] Fast Approximate Minimum Spanning Tree Algorithm Based on K-Means
    Zhong, Caiming
    Malinen, Mikko
    Miao, Duoqian
    Franti, Pasi
    [J]. COMPUTER ANALYSIS OF IMAGES AND PATTERNS, PT I, 2013, 8047 : 262 - 269
  • [6] Efficient construction of an approximate similarity graph for minimum spanning tree based clustering
    Mishra, Gaurav
    Mohanty, Sraban Kumar
    [J]. APPLIED SOFT COMPUTING, 2020, 97
  • [7] A FAST ALGORITHM FOR THE MINIMUM SPANNING TREE
    SURAWEERA, F
    [J]. COMPUTERS IN INDUSTRY, 1989, 13 (02) : 181 - 185
  • [8] Memory-efficient enumeration of constrained spanning trees
    Nievergelt, J
    Deo, N
    Marzetta, A
    [J]. INFORMATION PROCESSING LETTERS, 1999, 72 (1-2) : 47 - 53
  • [9] An Efficient Minimum Spanning Tree Algorithm
    Abdullah-Al Mamun
    Rajasekaran, Sanguthevar
    [J]. 2016 IEEE SYMPOSIUM ON COMPUTERS AND COMMUNICATION (ISCC), 2016, : 1047 - 1052
  • [10] Fast reoptimization for the minimum spanning tree problem
    Boria, Nicolas
    Paschos, Vangelis Th.
    [J]. JOURNAL OF DISCRETE ALGORITHMS, 2010, 8 (03) : 296 - 310