A Learning-to-Rank Formulation of Clustering-Based Approximate Nearest Neighbor Search

Cited by: 1
Authors
Vecchiato, Thomas [1]
Lucchese, Claudio [1]
Nardini, Franco Maria [2]
Bruch, Sebastian [3]
Affiliations
[1] Ca Foscari Univ Venice, Venice, Italy
[2] ISTI CNR, Pisa, Italy
[3] Pinecone, New York, NY, USA
Source
PROCEEDINGS OF THE 47TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2024 | 2024
Keywords
Approximate Nearest Neighbor Search; Inverted File; Learning to Rank; EFFICIENT
DOI
10.1145/3626772.3657931
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
A critical piece of the modern information retrieval puzzle is approximate nearest neighbor search. Its objective is to return a set of k data points that are closest to a query point, with its accuracy measured by the proportion of exact nearest neighbors captured in the returned set. One popular approach to this question is clustering: The indexing algorithm partitions data points into non-overlapping subsets and represents each partition by a point such as its centroid. The query processing algorithm first identifies the nearest clusters, a process known as routing, then performs a nearest neighbor search over those clusters only. In this work, we make a simple observation: The routing function solves a ranking problem. Its quality can therefore be assessed with a ranking metric, making the function amenable to learning-to-rank. Interestingly, ground-truth is often freely available: Given a query distribution in a top-k configuration, the ground-truth is the set of clusters that contain the exact top-k vectors. We develop this insight and apply it to Maximum Inner Product Search (MIPS). As we demonstrate empirically on various datasets, learning a simple linear function consistently improves the accuracy of clustering-based MIPS.
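The pipeline the abstract describes can be illustrated with a small sketch. This is a toy illustration under stated assumptions, not the authors' implementation: the data, the hand-picked partition, and every function name (`route`, `search`, `ground_truth_clusters`, `recall`) are hypothetical. It shows centroid-based routing for MIPS, the ground-truth cluster set for a top-k query, and a case where the centroid ranking misses the cluster holding the exact top-1 vector. The learned routing function itself is not shown; the ground-truth set is only the kind of label such a function could be trained on.

```python
from typing import List, Sequence, Set

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product, the similarity used in MIPS."""
    return sum(x * y for x, y in zip(a, b))


def centroid(vectors: List[Vector]) -> List[float]:
    """Mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def route(query: Vector, representatives: List[Vector], n_probe: int) -> List[int]:
    """Rank clusters by <query, representative>; return the top n_probe ids."""
    order = sorted(range(len(representatives)),
                   key=lambda c: dot(query, representatives[c]),
                   reverse=True)
    return order[:n_probe]


def exact_top_k(query: Vector, points: List[Vector], k: int) -> List[int]:
    """Brute-force top-k point ids by inner product."""
    order = sorted(range(len(points)),
                   key=lambda i: dot(query, points[i]),
                   reverse=True)
    return order[:k]


def ground_truth_clusters(query: Vector, points: List[Vector],
                          assignment: List[int], k: int) -> Set[int]:
    """Clusters that contain the exact top-k vectors (the ranking labels)."""
    return {assignment[i] for i in exact_top_k(query, points, k)}


def search(query: Vector, points: List[Vector], assignment: List[int],
           representatives: List[Vector], k: int, n_probe: int) -> List[int]:
    """Route to n_probe clusters, then search only the points inside them."""
    probed = set(route(query, representatives, n_probe))
    candidates = [i for i in range(len(points)) if assignment[i] in probed]
    candidates.sort(key=lambda i: dot(query, points[i]), reverse=True)
    return candidates[:k]


def recall(approx: List[int], exact: List[int]) -> float:
    """Proportion of exact nearest neighbors captured in the returned set."""
    return len(set(approx) & set(exact)) / len(exact)


# Toy data (assumption): six 2-D points in three hand-assigned clusters.
points = [[1.0, 0.0], [0.9, 0.1],    # cluster 0
          [0.0, 1.0], [0.1, 0.9],    # cluster 1
          [-2.0, 0.0], [2.5, 0.0]]   # cluster 2, centroid near the origin
assignment = [0, 0, 1, 1, 2, 2]
representatives = [
    centroid([p for p, c in zip(points, assignment) if c == cid])
    for cid in range(3)
]
query = [1.0, 0.0]

exact = exact_top_k(query, points, k=1)
labels = ground_truth_clusters(query, points, assignment, k=1)
recall_1 = recall(search(query, points, assignment, representatives, 1, 1), exact)
recall_2 = recall(search(query, points, assignment, representatives, 1, 2), exact)
```

In this toy setup the centroid ranking places cluster 0 first, while the exact top-1 vector sits in cluster 2, so probing one cluster gives recall 0 and probing two gives recall 1. A routing function trained to rank the ground-truth clusters first would aim to close that gap at the same probe budget.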
Pages: 2261-2265
Number of pages: 5
Related papers
50 in total
  • [1] Hierarchical Clustering-Based Graphs for Large Scale Approximate Nearest Neighbor Search
    Munoz, Javier Vargas
    Goncalves, Marcos A.
    Dias, Zanoni
    Torres, Ricardo da S.
    PATTERN RECOGNITION, 2019, 96
  • [2] Clustering-Based Transductive Semi-Supervised Learning for Learning-to-Rank
    Rahangdale, Ashwini
    Raut, Shital
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2019, 33 (12)
  • [3] Clustering-based Nearest Neighbor Searching
    Ling, Ping
    Rong, Xiangsheng
    Dong, Yongquan
    JOURNAL OF COMPUTERS, 2013, 8 (08) : 2085 - 2092
  • [4] Local Deep Learning Quantization for Approximate Nearest Neighbor Search
    Li, Quan
    Xie, Xike
    Wang, Chao
    Weng, Jiali
    PROCEEDINGS OF THE 4TH ANNUAL ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2024, 2024, : 1125 - 1129
  • [5] Projection Search For Approximate Nearest Neighbor
    Feng, Cheng
    Yang, Bo
    2016 17TH IEEE/ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING AND PARALLEL/DISTRIBUTED COMPUTING (SNPD), 2016, : 33 - 38
  • [6] Hardness of Approximate Nearest Neighbor Search
    Rubinstein, Aviad
    STOC'18: PROCEEDINGS OF THE 50TH ANNUAL ACM SIGACT SYMPOSIUM ON THEORY OF COMPUTING, 2018, : 1260 - 1268
  • [7] Clustering-based reference set reduction for k-nearest neighbor
    Hwang, Seongseob
    Cho, Sungzoon
    ADVANCES IN NEURAL NETWORKS - ISNN 2007, PT 2, PROCEEDINGS, 2007, 4492 : 880 - +
  • [8] Approximate Nearest Neighbor Search on Standard Search Engines
    Carrara, Fabio
    Vadicamo, Lucia
    Gennaro, Claudio
    Amato, Giuseppe
    SIMILARITY SEARCH AND APPLICATIONS (SISAP 2022), 2022, 13590 : 214 - 221
  • [9] Learning Adaptive Hypersphere: Boosting Efficiency on Approximate Nearest Neighbor Search
    Ai, Liefu
    Jiang, Changyu
    IEEE SIGNAL PROCESSING LETTERS, 2024, 31 : 2190 - 2194
  • [10] Accumulative Quantization for Approximate Nearest Neighbor Search
    Ai, Liefu
    Tao, Yong
    Cheng, Hongjun
    Wang, Yuanzhi
    Xie, Shaoguo
    Liu, Deyang
    Zheng, Xin
    COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2022, 2022