A Learning-to-Rank Formulation of Clustering-Based Approximate Nearest Neighbor Search

Cited by: 1

Authors:

Vecchiato, Thomas [1]

Lucchese, Claudio [1]

Nardini, Franco Maria [2]

Bruch, Sebastian [3]

Affiliations:

[1] Ca Foscari Univ Venice, Venice, Italy

[2] ISTI CNR, Pisa, Italy

[3] Pinecone, New York, NY USA

Source:

PROCEEDINGS OF THE 47TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2024 | 2024

Keywords:

Approximate Nearest Neighbor Search; Inverted File; Learning to Rank; EFFICIENT;

DOI:

10.1145/3626772.3657931

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

A critical piece of the modern information retrieval puzzle is approximate nearest neighbor search. Its objective is to return a set of k data points that are closest to a query point, with its accuracy measured by the proportion of exact nearest neighbors captured in the returned set. One popular approach to this question is clustering: The indexing algorithm partitions data points into non-overlapping subsets and represents each partition by a point such as its centroid. The query processing algorithm first identifies the nearest clusters (a process known as routing), then performs a nearest neighbor search over those clusters only. In this work, we make a simple observation: The routing function solves a ranking problem. Its quality can therefore be assessed with a ranking metric, making the function amenable to learning-to-rank. Interestingly, ground-truth is often freely available: Given a query distribution in a top-k configuration, the ground-truth is the set of clusters that contain the exact top-k vectors. We develop this insight and apply it to Maximum Inner Product Search (MIPS). As we demonstrate empirically on various datasets, learning a simple linear function consistently improves the accuracy of clustering-based MIPS.
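The pipeline described in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation. Everything named here is an assumption made for illustration: the toy Gaussian data, the arbitrary round-robin partition standing in for a real clustering algorithm, the helper names, and the choice to fit the linear routing function with softmax cross-entropy against the cluster that holds the exact top-1 point. The sketch shows three things: routing as ranking clusters by a score, ground-truth labels derived from exact search, and accuracy as the fraction of exact top-k points recovered.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def exact_top_k(query, points, k):
    """Indices of the k points with the largest inner product (exact MIPS)."""
    order = sorted(range(len(points)), key=lambda i: dot(query, points[i]), reverse=True)
    return order[:k]

def route(query, W, n_probe):
    """Routing as ranking: score cluster c by <query, W[c]>, keep the top n_probe."""
    order = sorted(range(len(W)), key=lambda c: dot(query, W[c]), reverse=True)
    return order[:n_probe]

def search(query, points, assignment, W, k, n_probe):
    """Search only the points that live in the routed clusters."""
    probed = set(route(query, W, n_probe))
    candidates = [i for i in range(len(points)) if assignment[i] in probed]
    candidates.sort(key=lambda i: dot(query, points[i]), reverse=True)
    return candidates[:k]

def accuracy(query, points, assignment, W, k, n_probe):
    """Proportion of the exact top-k points captured in the returned set."""
    exact = set(exact_top_k(query, points, k))
    return len(exact & set(search(query, points, assignment, W, k, n_probe))) / k

def relevant_clusters(query, points, assignment, k):
    """Ground truth for routing: clusters that contain the exact top-k points."""
    return {assignment[i] for i in exact_top_k(query, points, k)}

def train_linear_router(queries, labels, init, epochs=20, lr=0.1):
    """SGD on softmax cross-entropy (an assumed loss); W starts at `init`."""
    W = [row[:] for row in init]
    for _ in range(epochs):
        for q, y in zip(queries, labels):
            scores = [dot(w, q) for w in W]
            m = max(scores)  # subtract the max for numerical stability
            exps = [math.exp(s - m) for s in scores]
            z = sum(exps)
            for c, w in enumerate(W):
                grad = exps[c] / z - (1.0 if c == y else 0.0)
                for d in range(len(w)):
                    w[d] -= lr * grad * q[d]
    return W

# Toy setup (hypothetical sizes and data).
rng = random.Random(0)
DIM, N_POINTS, N_CLUSTERS, K = 4, 60, 4, 3
points = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(N_POINTS)]
assignment = [i % N_CLUSTERS for i in range(N_POINTS)]  # stand-in for a real clustering

centroids = []
for c in range(N_CLUSTERS):
    members = [p for p, a in zip(points, assignment) if a == c]
    centroids.append([sum(col) / len(members) for col in zip(*members)])

train_queries = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(200)]
# Label of each training query: the cluster holding its exact top-1 point.
labels = [assignment[exact_top_k(q, points, 1)[0]] for q in train_queries]
W = train_linear_router(train_queries, labels, centroids)

test_queries = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(50)]

def mean_accuracy(router, n_probe):
    return sum(accuracy(q, points, assignment, router, K, n_probe)
               for q in test_queries) / len(test_queries)

print("centroid routing, 1 probe:", mean_accuracy(centroids, 1))
print("learned routing, 1 probe: ", mean_accuracy(W, 1))
```

Both routers share the same `route` function and differ only in the matrix used for scoring, so the comparison isolates the routing step. Probing every cluster reduces the procedure to exact search, which is a convenient sanity check; on toy data like this, no particular ordering between the two printed numbers should be expected.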


Pages: 2261-2265

Page count: 5
