Graph Reordering for Cache-Efficient Near Neighbor Search

Cited by: 0
Authors
Coleman, Benjamin [1, 2]
Segarra, Santiago [1]
Smola, Alex [2]
Shrivastava, Anshumali [3]
Affiliations
[1] Rice Univ, ECE Dept, Houston, TX 77005 USA
[2] Amazon Web Serv, Seattle, WA USA
[3] Rice Univ, Dept Comp Sci, Houston, TX 77005 USA
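Funding
National Science Foundation (USA)
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract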
Graph search is one of the most successful algorithmic trends in near neighbor search. Several of the most popular and empirically successful algorithms are, at their core, a greedy walk along a pruned near neighbor graph. However, graph traversal applications often suffer from poor memory access patterns, and near neighbor search is no exception to this rule. Our measurements show that popular search indices such as the hierarchical navigable small-world graph (HNSW) can have poor cache miss performance. To address this issue, we formulate the graph traversal problem as a cache hit maximization task and propose multiple graph reordering as a solution. Graph reordering is a memory layout optimization that groups commonly-accessed nodes together in memory. We mathematically formalize the connection between the graph layout and the cache complexity of search. We present exhaustive experiments applying several reordering algorithms to a leading graph-based near neighbor method based on the HNSW index. We find that reordering improves the query time by up to 40%, we present analysis and improvements for existing graph layout methods, and we demonstrate that the time needed to reorder the graph is negligible compared to the time required to construct the index.
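As a rough illustration of what "grouping commonly-accessed nodes together in memory" can mean, the sketch below relabels the nodes of a small adjacency-list graph in breadth-first order, so that neighbors receive nearby IDs. This is only a generic example: the abstract does not name the reordering algorithms evaluated, and the helper names (`bfs_order`, `relabel`, `mean_gap`) and the mean ID gap used as a locality proxy are assumptions made here, not taken from the paper.

```python
from collections import deque


def bfs_order(adj, start=0):
    """Return a list `order` with order[new_id] = old_id, in BFS visiting order.

    Hypothetical helper: BFS is one simple reordering heuristic, used here
    only as an example; it is not claimed to be the paper's method.
    """
    n = len(adj)
    seen = [False] * n
    order = []
    # Try `start` first, then any node not yet reached (disconnected graphs).
    for root in [start] + [v for v in range(n) if v != start]:
        if seen[root]:
            continue
        seen[root] = True
        queue = deque([root])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if not seen[v]:
                    seen[v] = True
                    queue.append(v)
    return order


def relabel(adj, order):
    """Rebuild the adjacency list so node order[i] is stored at position i."""
    new_id = [0] * len(adj)
    for new, old in enumerate(order):
        new_id[old] = new
    return [[new_id[v] for v in adj[old]] for old in order]


def mean_gap(adj):
    """Average |u - v| over all stored edges: a crude proxy for memory locality."""
    gaps = [abs(u - v) for u, nbrs in enumerate(adj) for v in nbrs]
    return sum(gaps) / len(gaps) if gaps else 0.0


# A path graph 0-3-1-4-2 whose labels are scrambled relative to its structure.
scrambled = [[3], [3, 4], [4], [0, 1], [1, 2]]
order = bfs_order(scrambled, start=0)
reordered = relabel(scrambled, order)
print(mean_gap(scrambled), mean_gap(reordered))
```

If node data is stored contiguously by ID, a smaller gap between neighboring IDs suggests that a greedy walk touches memory locations that are closer together; actual cache behavior depends on node size, cache line size, and the query workload.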
Pages: 13
Related papers
50 in total
  • [21] Cache-efficient memory layout of aggregate data structures
    Panda, PR
    Semeria, L
    de Micheli, G
    ISSS'01: 14TH INTERNATIONAL SYMPOSIUM ON SYSTEM SYNTHESIS, 2001, : 101 - 106
  • [22] Efficient Parallel Random Sampling-Vectorized, Cache-Efficient, and Online
    Sanders, Peter
    Lamm, Sebastian
    Huebschle-Schneider, Lorenz
    Schrade, Emanuel
    Dachsbacher, Carsten
    ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 2018, 44 (03):
  • [23] Cache-Oblivious Buffer Heap and Cache-Efficient Computation of Shortest Paths in Graphs
    Chowdhury, Rezaul A.
    Ramachandran, Vijaya
    ACM TRANSACTIONS ON ALGORITHMS, 2018, 14 (01)
  • [24] Efficient prototype reordering in nearest neighbor classification
    Bandyopadhyay, S
    Maulik, U
    PATTERN RECOGNITION, 2002, 35 (12) : 2791 - 2799
  • [25] Aligned Scheduling: Cache-Efficient Instruction Scheduling for VLIW Processors
    Porpodas, Vasileios
    Cintra, Marcelo
    LANGUAGES AND COMPILERS FOR PARALLEL COMPUTING, LCPC 2013, 2014, 8664 : 275 - 291
  • [26] Cache-efficient implementation of FIR filters using the Blackfin microcomputer
    Zoican, Sorin
    TELSIKS 2007: 8TH INTERNATIONAL CONFERENCE ON TELECOMMUNICATIONS IN MODERN SATELLITE, CABLE AND BROADCASTING SERVICES, VOLS 1 AND 2, 2007, : 461 - 464
  • [27] Cache-Efficient Approach for Index-Free Personalized PageRank
    Tsuchida, Kohei
    Matsumoto, Naoki
    Shin, Andrew
    Kaneko, Kunitake
    IEEE ACCESS, 2023, 11 : 6944 - 6957
  • [28] Cache-Efficient Fork-Processing Patterns on Large Graphs
    Lu, Shengliang
    Sun, Shixuan
    Paul, Johns
    Li, Yuchen
    He, Bingsheng
    SIGMOD '21: PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2021, : 1208 - 1221
  • [29] Parallel cache-efficient code for computing the McCaskill partition functions
    Palkowski, Marek
    Bielecki, Wlodzimierz
    PROCEEDINGS OF THE 2019 FEDERATED CONFERENCE ON COMPUTER SCIENCE AND INFORMATION SYSTEMS (FEDCSIS), 2019, : 207 - 210
  • [30] Movi: A fast and cache-efficient full-text pangenome index
    Zakeri, Mohsen
    Brown, Nathaniel K.
    Ahmed, Omar Y.
    Gagie, Travis
    Langmead, Ben
    ISCIENCE, 2024, 27 (12)