Near neighbor searching with K nearest references

Cited by: 18

Authors:

Chavez, E. [1]

Graff, M. [3]

Navarro, G. [2]

Tellez, E. S. [3]

Affiliations:

[1] CICESE, Mexico City, DF, Mexico

[2] Univ Chile, Dept Comp Sci, CeBiB Ctr Biotechnol & Bioengn, Santiago, Chile

[3] INFOTEC Catedra CONACyT, Mexico City, DF, Mexico

Source:

INFORMATION SYSTEMS | 2015 / Vol. 51

Keywords:

Proximity search; Searching by content in multimedia databases; k nearest neighbors; Indexing metric spaces; SIMILARITY SEARCH; APPROXIMATE; INDEX; REPRESENTATIONS; ALGORITHM; SPACES;

DOI:

10.1016/j.is.2015.02.001

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Subject classification code:

0812 ;

Abstract:

Proximity searching is the problem of retrieving, from a given database, those objects closest to a query. To avoid exhaustive searching, data structures called indexes are built on the database prior to serving queries. The curse of dimensionality is a well-known problem for indexes: in spaces with sufficiently concentrated distance histograms, no index outperforms an exhaustive scan of the database. In recent years, a number of indexes for approximate proximity searching have been proposed. These are able to cope with the curse of dimensionality in exchange for returning an answer that might be slightly different from the correct one. In this paper we show that many of those recent indexes can be understood as variants of a simple general model based on K-nearest reference signatures. A set of references is chosen from the database, and the signature of each object consists of the K references nearest to the object. At query time, the signature of the query is computed and the search examines only the objects whose signature is close enough to that of the query. Many known and novel indexes are obtained by considering different ways to determine how much detail the signature records (e.g., just the set of nearest references, or also their proximity order to the object, or also their distances to the object, and so on), how the similarity between signatures is defined, and how the parameters are tuned. In addition, we introduce a space-efficient representation for those families of indexes, making it possible to search very large databases in main memory. Small indexes are cache friendly, inducing faster queries. We perform exhaustive experiments comparing several known and new indexes that derive from our framework, evaluating their time performance, memory usage, and quality of approximation. The best indexes outperform the state of the art, offering an attractive balance between all these aspects, and turn out to be excellent choices in many scenarios. Our framework gives high flexibility to design new indexes. (C) 2015 Elsevier Ltd. All rights reserved.
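
The general model described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes Euclidean points, references sampled at random from the database, a signature that records only the set of the K nearest references, and a signature similarity defined as the number of shared references. The names `signature`, `build_index`, `search`, and the `min_shared` threshold are hypothetical and chosen for illustration only.

```python
import math
import random


def signature(obj, references, K):
    """Return the set of indices of the K references nearest to obj."""
    order = sorted(range(len(references)),
                   key=lambda i: math.dist(obj, references[i]))
    return frozenset(order[:K])


def build_index(database, num_refs, K, seed=0):
    """Pick references from the database and compute one signature per object."""
    rng = random.Random(seed)
    references = rng.sample(database, num_refs)  # assumption: random selection
    signatures = [signature(obj, references, K) for obj in database]
    return references, signatures


def search(query, database, references, signatures, K, k, min_shared=1):
    """Approximate k-NN: examine only objects whose signature shares at
    least min_shared references with the query signature, then rank those
    candidates by their true distance to the query."""
    q_sig = signature(query, references, K)
    candidates = [i for i, sig in enumerate(signatures)
                  if len(q_sig & sig) >= min_shared]
    candidates.sort(key=lambda i: math.dist(query, database[i]))
    return candidates[:k]


if __name__ == "__main__":
    # Toy database: a 10 x 10 grid of 2-D points.
    db = [(float(x), float(y)) for x in range(10) for y in range(10)]
    refs, sigs = build_index(db, num_refs=16, K=3)
    print(search((3.2, 6.9), db, refs, sigs, K=3, k=5, min_shared=2))
```

Raising `min_shared` shrinks the candidate set and speeds up the query at the cost of recall, while `min_shared=0` degenerates to an exhaustive scan. The richer variants mentioned in the abstract, which also record the proximity order of the references or their distances, would replace the set-overlap test with a different signature similarity.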


Pages: 43 - 61

Page count: 19

50 related records in total

[21] k Nearest Neighbor Classification Coprocessor with Weighted Clock-Mapping-Based Searching
An, Fengwei
Chen, Lei
Akazawa, Toshinobu
Yamasaki, Shogo
Mattausch, Hans Jurgen
IEICE TRANSACTIONS ON ELECTRONICS, 2016, E99C (03) : 397 - 403
[22] Algorithm for searching nearest-neighbor based on the bounded k-d tree
College of Mechanical Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
Unknown
Huazhong Ligong Daxue Xuebao, 2008, 7 (73-76):
[23] Distance-Constraint k-Nearest Neighbor Searching in Mobile Sensor Networks
Han, Yongkoo
Park, Kisung
Hong, Jihye
Ulamin, Noor
Lee, Young-Koo
SENSORS, 2015, 15 (08) : 18209 - 18228
[24] On k-nearest neighbor searching in non-ordered discrete data spaces
Kolbe, Dashiell
Zhu, Qiang
Pramanik, Sakti
2007 IEEE 23RD INTERNATIONAL CONFERENCE ON DATA ENGINEERING, VOLS 1-3, 2007, : 401 - +
[25] Navigating K-Nearest Neighbor Graphs to Solve Nearest Neighbor Searches
Chavez, Edgar
Sadit Tellez, Eric
ADVANCES IN PATTERN RECOGNITION, 2010, 6256 : 270 - 280
[26] Complexity analysis for partitioning nearest neighbor searching algorithms
Zakarauskas, P
Ozard, JM
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1996, 18 (06) : 663 - 668
[27] Hit-directed nearest neighbor searching.
Shanmugasundaram, V
Maggiora, GM
ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 2004, 227 : U688 - U688
[28] On the Most Likely Voronoi Diagram and Nearest Neighbor Searching
Suri, Subhash
Verbeek, Kevin
ALGORITHMS AND COMPUTATION, ISAAC 2014, 2014, 8889 : 338 - 350
[29] Fuzzy Monotonic K-Nearest Neighbor Versus Monotonic Fuzzy K-Nearest Neighbor
Zhu, Hong
Wang, Xizhao
Wang, Ran
IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2022, 30 (09) : 3501 - 3513
[30] Accounting for boundary effects in nearest-neighbor searching
Arya, S
Mount, DM
Narayan, O
DISCRETE & COMPUTATIONAL GEOMETRY, 1996, 16 (02) : 155 - 176
