Data-dependent metric filtering

Authors
Mic, Vladimir [1]
Zezula, Pavel [1]
Affiliations
[1] Masaryk Univ, Botanicka 68a, Brno 60200, Czech Republic
Keywords
Metric Space Searching; Similarity Search; Metric Filtering; Data Dependent Filtering
DOI
10.1016/j.is.2021.101980
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Filtering is a fundamental strategy of metric similarity indexes to minimise the number of computed distances. Given a triplet of objects for which the distances of two pairs are known, lower and upper bounds on the third distance can be determined using the triangle inequality. The tightness of these bounds is crucial for efficiency: the more precise the estimate, the more distance computations can be avoided, and the more efficient the search is. We show that it is not necessary to consider arbitrary angles in triangles formed by the pairwise distances of three objects, as the specific range of possible angles is data dependent. When considering realistic ranges of angles, the bounds on distances can be much tighter and filtering much more effective. We formalise the problem of data-dependent estimation of bounds on distances and analyse in depth the limited angles in triangles of distances. We justify the potential of data-dependent metric filtering both analytically and experimentally, executing many distance estimations on several real-life datasets. (c) 2021 Elsevier Ltd. All rights reserved.
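The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' formulation: the function names and the angle range passed to `angle_limited_bounds` are hypothetical, and the sketch only assumes that any three objects of a metric space can be drawn as a planar triangle, so the law of cosines relates the unknown third distance to the angle between the two known sides.

```python
import math


def triangle_bounds(d_qp, d_po):
    """Classic triangle-inequality bounds on d(q, o),
    given the known distances d(q, p) and d(p, o)."""
    return abs(d_qp - d_po), d_qp + d_po


def angle_limited_bounds(d_qp, d_po, gamma_min, gamma_max):
    """Bounds on d(q, o) when the angle at p (in radians) is assumed
    to lie in [gamma_min, gamma_max], a subset of [0, pi].

    The angle range is an assumption of this sketch; in a data-dependent
    setting it would be estimated from the dataset.
    """
    def third_side(gamma):
        # Law of cosines; clamp tiny negative values caused by rounding.
        sq = d_qp ** 2 + d_po ** 2 - 2.0 * d_qp * d_po * math.cos(gamma)
        return math.sqrt(max(sq, 0.0))

    # The third side grows monotonically with the angle on [0, pi],
    # so the extreme angles give the extreme distances.
    return third_side(gamma_min), third_side(gamma_max)


if __name__ == "__main__":
    print(triangle_bounds(3.0, 4.0))
    print(angle_limited_bounds(3.0, 4.0, math.pi / 3, 2 * math.pi / 3))
```

With the full angle range [0, pi] the second function reproduces the triangle-inequality bounds; any narrower range yields an interval contained in them, which is the effect the abstract describes as tighter bounds.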
Pages: 21