Caching Search Engine Results over Incremental Indices

Cited by: 0
Authors
Blanco, Roi [1]
Bortnikov, Edward
Junqueira, Flavio P. [1]
Lempel, Ronny
Telloli, Luca
Zaragoza, Hugo [1]
Affiliations
[1] Yahoo Res, Barcelona, Spain
Keywords
Search engine caching; Real-time indexing
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A Web search engine must update its index periodically to incorporate changes to the Web. We argue in this paper that index updates fundamentally impact the design of search engine result caches, a performance-critical component of modern search engines. Index updates lead to the problem of cache invalidation: invalidating cached entries of queries whose results have changed. Naive approaches, such as flushing the entire cache upon every index update, lead to poor performance and in fact, render caching futile when the frequency of updates is high. Solving the invalidation problem efficiently corresponds to predicting accurately which queries will produce different results if re-evaluated, given the actual changes to the index. To obtain this property, we propose a framework for developing invalidation predictors and define metrics to evaluate invalidation schemes. We describe concrete predictors using this framework and compare them against a baseline that uses a cache invalidation scheme based on time-to-live (TTL). Evaluation over Wikipedia documents using a query log from the Yahoo! search engine shows that selective invalidation of cached search results can lower the number of unnecessary query evaluations by as much as 30% compared to a baseline scheme, while returning results of similar freshness. In general, our predictors enable fewer unnecessary invalidations and fewer stale results compared to a TTL-only scheme for similar freshness of results.
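The contrast the abstract draws between TTL-based expiry and selective invalidation can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `ResultCache`, `Entry`, and `invalidate_for_document` are hypothetical, and the term-containment rule is a deliberately simple stand-in, not one of the predictors described in the paper.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Entry:
    results: list
    cached_at: float  # timestamp at which the results were cached


@dataclass
class ResultCache:
    ttl: float  # seconds; entries older than this are treated as expired
    entries: dict = field(default_factory=dict)  # query string -> Entry

    def put(self, query, results, now=None):
        now = time.time() if now is None else now
        self.entries[query] = Entry(results, now)

    def get(self, query, now=None):
        """TTL baseline: return cached results unless the entry is too old."""
        now = time.time() if now is None else now
        entry = self.entries.get(query)
        if entry is None:
            return None
        if now - entry.cached_at > self.ttl:
            del self.entries[query]  # expired, force re-evaluation
            return None
        return entry.results

    def invalidate_for_document(self, doc_terms):
        """Hypothetical predictor (an assumption, not the paper's method):
        drop every cached query whose terms all occur in the changed document."""
        doc_terms = set(doc_terms)
        hit = [q for q in self.entries if set(q.split()) <= doc_terms]
        for q in hit:
            del self.entries[q]
        return hit


cache = ResultCache(ttl=60)
cache.put("barcelona weather", ["d1"], now=0)
cache.put("search caching", ["d2"], now=0)

# An index update adds or changes a document containing these terms.
invalidated = cache.invalidate_for_document(["barcelona", "weather", "today"])
```

In this sketch the index update invalidates only the query that the changed document could plausibly affect, while the unrelated entry stays cached until its TTL runs out. A predictor of this kind can still be wrong in both directions, which is what the abstract's notions of unnecessary invalidations and stale results capture.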
Pages: 82-89
Number of pages: 8