Efficient Document-at-a-time and Score-at-a-time Query Evaluation for Learned Sparse Representations

Cited by: 9
Authors
MacKenzie, Joel [1]
Trotman, Andrew [2]
Lin, Jimmy [3]
Affiliations
[1] Univ Queensland, St Lucia, Qld, Australia
[2] Univ Otago, Dept Comp Sci, POB 56, Dunedin, New Zealand
[3] Univ Waterloo, 200 Univ Ave West, Waterloo, ON N2L 3G1, Canada
Funding
Australian Research Council; Natural Sciences and Engineering Research Council of Canada
Keywords
Efficiency; indexing; query processing; learned sparse retrieval; strategies
DOI
10.1145/3576922
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Researchers have had much recent success with ranking models based on so-called learned sparse representations generated by transformers. One crucial advantage of this approach is that such models can exploit inverted indexes for top-k retrieval, thereby leveraging decades of work on efficient query evaluation. Yet, there remain many open questions about how these learned representations fit within the existing literature, which our work aims to tackle using four representative learned sparse models. We find that impact weights generated by transformers appear to greatly reduce opportunities for skipping and early exiting optimizations in well-studied document-at-a-time (DAAT) approaches. Similarly, "off-the-shelf" application of score-at-a-time (SAAT) processing exhibits a mismatch between these weights and assumptions behind accumulator management strategies. Building on these observations, we present solutions to address deficiencies with both DAAT and SAAT approaches, yielding substantial speedups in query evaluation. Our detailed empirical analysis demonstrates that both methods lie on the effectiveness-efficiency Pareto frontier, indicating that the optimal choice for deployment depends on operational constraints.
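To make the two traversal orders named in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical toy index (`INDEX`) with integer impact weights, and the function names and the `max_postings` budget are illustrative only. The DAAT routine is exhaustive (no skipping or early exit such as dynamic pruning), and the SAAT routine uses a plain dictionary as its accumulator table.

```python
import heapq
from collections import defaultdict

# Hypothetical toy index: term -> [(doc_id, impact)], sorted by doc_id.
INDEX = {
    "sparse": [(1, 3), (2, 1), (4, 2)],
    "retrieval": [(1, 2), (3, 4), (4, 1)],
}


def daat_top_k(index, query, k):
    """Exhaustive DAAT: fully score one document before moving to the next."""
    lists = [index[t] for t in query if t in index]
    cursors = [0] * len(lists)
    heap = []  # min-heap of (score, -doc_id); holds the current top-k
    while True:
        # Smallest doc_id not yet processed across all postings lists.
        candidates = [lst[c][0] for lst, c in zip(lists, cursors) if c < len(lst)]
        if not candidates:
            break
        doc = min(candidates)
        score = 0
        for i, lst in enumerate(lists):
            c = cursors[i]
            if c < len(lst) and lst[c][0] == doc:
                score += lst[c][1]
                cursors[i] += 1
        heapq.heappush(heap, (score, -doc))
        if len(heap) > k:
            heapq.heappop(heap)  # evict the current lowest-scoring document
    # Return (doc_id, score), highest score first, ties broken by doc_id.
    return sorted(((-d, s) for s, d in heap), key=lambda x: (-x[1], x[0]))


def saat_top_k(index, query, k, max_postings=None):
    """SAAT: process postings in descending impact order into accumulators.

    max_postings is a hypothetical budget; stopping early gives an
    approximate ranking built from the highest-impact postings only.
    """
    postings = sorted(
        ((impact, doc) for t in query if t in index for doc, impact in index[t]),
        key=lambda p: -p[0],
    )
    if max_postings is not None:
        postings = postings[:max_postings]
    acc = defaultdict(int)  # accumulator table: doc_id -> partial score
    for impact, doc in postings:
        acc[doc] += impact
    return heapq.nsmallest(k, acc.items(), key=lambda x: (-x[1], x[0]))


query = ["sparse", "retrieval"]
exact_daat = daat_top_k(INDEX, query, k=2)
exact_saat = saat_top_k(INDEX, query, k=2)
approx_saat = saat_top_k(INDEX, query, k=2, max_postings=2)
```

On this toy data the two exhaustive traversals return the same top-2, while the budgeted SAAT run returns a different ordering because it only sees the two highest-impact postings. That is the effectiveness-efficiency trade-off the abstract refers to, shown here only in miniature.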
Pages: 28