EFFICIENT ASTRONOMICAL DATA CONDENSATION USING APPROXIMATE NEAREST NEIGHBORS

Cited by: 2
Authors
Lukasik, Szymon [1, 2]
Lalik, Konrad [1]
Sarna, Piotr [1]
Kowalski, Piotr A. [1, 2]
Charytanowicz, Malgorzata [2, 3]
Kulczycki, Piotr [1, 2]
Affiliations
[1] AGH Univ Sci & Technol, Fac Phys & Appl Comp Sci, Al Mickiewicza 30, PL-30059 Krakow, Poland
[2] Polish Acad Sci, Syst Res Inst, Ul Newelska 6, PL-01447 Warsaw, Poland
[3] Lublin Univ Technol, Fac Elect Engn & Comp Sci, Ul Nadbystrzycka 38D, PL-20618 Lublin, Poland
Keywords
big data; astronomy; data reduction; nearest neighbor search; kd-trees
DOI
10.2478/amcs-2019-0034
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Extracting useful information from astronomical observations is one of the most challenging tasks in data exploration, largely because of the volume of data acquired with advanced observational tools. While other challenges typical of big data problems (such as data variety) are also present, dataset size is the most significant obstacle to visualization and subsequent analysis. This paper studies an efficient data condensation algorithm that produces a compact representation of a dataset. It is based on fast nearest neighbor calculation using tree structures and parallel processing. The possibility of using approximate identification of neighbors to further improve the algorithm's running time is also evaluated. The properties of the proposed approach, in terms of both performance and condensation quality, are experimentally assessed on astronomical datasets related to the GAIA mission. It is concluded that the introduced technique might serve as a scalable method of alleviating the problem of dataset size.
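The abstract does not spell out the condensation procedure, so the following is only a minimal sketch of one plausible reading: a density-based scheme in which each point's distance to its k-th nearest neighbor is found with a kd-tree, the densest points are kept, and points lying close to a kept point are discarded. The function name `condense`, the pruning factor of 2, and the parameter defaults are assumptions made for illustration and are not taken from the paper. The approximate search is expressed through the `eps` argument of SciPy's `cKDTree.query`, and parallel processing through its `workers` argument (available in SciPy 1.6 and later).

```python
import numpy as np
from scipy.spatial import cKDTree


def condense(points, k=5, eps=0.0, workers=1):
    """Hypothetical density-based condensation sketch (not the paper's code).

    Returns the indices of the retained points and their k-NN radii.
    eps > 0 allows approximate neighbors; workers=-1 uses all CPU cores.
    """
    points = np.asarray(points, dtype=float)
    tree = cKDTree(points)

    # Query k + 1 neighbors because each point is its own nearest neighbor.
    dist, _ = tree.query(points, k=k + 1, eps=eps, workers=workers)
    r_k = dist[:, -1]  # distance to the k-th neighbor (small = dense region)

    alive = np.ones(len(points), dtype=bool)
    selected = []
    # Visit points from the densest region to the sparsest one.
    for i in np.argsort(r_k):
        if not alive[i]:
            continue
        selected.append(i)
        # Assumed pruning rule: drop everything within 2 * r_k of a kept point.
        covered = tree.query_ball_point(points[i], 2.0 * r_k[i])
        alive[covered] = False

    selected = np.array(selected)
    return selected, r_k[selected]


# Synthetic stand-in for a catalog; real astronomical data is not used here.
rng = np.random.default_rng(0)
pts = rng.normal(size=(500, 2))
idx, radii = condense(pts, k=5, eps=0.1)
```

Setting `eps=0.0` gives exact neighbors; raising it trades accuracy of the k-NN radii for shorter query time, which is the kind of trade-off the abstract says is evaluated.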
Pages: 467-476
Page count: 10