Top-k Document Retrieval in Compact Space and Near-Optimal Time

Cited by: 0
Authors
Navarro, Gonzalo [1, 2]
Thankachan, Sharma V. [1]
Affiliations
[1] Univ Chile, Dept Comp Sci, Santiago, Chile
[2] Univ Diego Portales, Escuela Informat Telecommun, Santiago, Chile
Source
ALGORITHMS AND COMPUTATION | 2013 / Vol. 8283
Funding
National Science Foundation (USA);
Keywords
EFFICIENT ALGORITHMS;
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Let D = {d_1, d_2, ..., d_D} be a given set of D string documents of total length n. Our task is to index D such that the k most relevant documents for an online query pattern P of length p can be retrieved efficiently. There exist linear-space data structures of O(n) words that answer such queries in optimal O(p + k) time. In this paper, we describe a compact index of size |CSA| + n lg D + o(n lg D) bits with near-optimal time, O(p + k lg* n), for the basic relevance metric term frequency, where |CSA| is the size (in bits) of a compressed full-text index of D, and lg* n is the iterated logarithm of n.
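To make the query in the abstract concrete, the following is a minimal brute-force sketch in Python. It is not the compact index described in the paper: it scans every document for each query, so it serves only as a reference definition of "top-k by term frequency". The function names (`term_frequency`, `top_k_naive`, `log_star`) and the tie-breaking rule (lower document id first) are assumptions made for illustration. The `log_star` helper shows how slowly the lg* n factor in the O(p + k lg* n) bound grows.

```python
import heapq
import math


def term_frequency(doc: str, pattern: str) -> int:
    """Count occurrences of pattern in doc, overlapping ones included."""
    if not pattern:
        return 0
    count = 0
    start = doc.find(pattern)
    while start != -1:
        count += 1
        # Advance by one position so overlapping matches are counted.
        start = doc.find(pattern, start + 1)
    return count


def top_k_naive(docs, pattern, k):
    """Return up to k (doc_id, tf) pairs with the highest term frequency.

    Brute-force baseline: scans all documents per query.
    Ties are broken by lower doc_id (an assumption, not from the paper).
    Documents that do not contain the pattern are never reported.
    """
    hits = []
    for doc_id, doc in enumerate(docs):
        tf = term_frequency(doc, pattern)
        if tf > 0:
            hits.append((tf, doc_id))
    best = heapq.nsmallest(k, hits, key=lambda t: (-t[0], t[1]))
    return [(doc_id, tf) for tf, doc_id in best]


def log_star(n: float) -> int:
    """Iterated logarithm, base 2: how many times lg is applied until n <= 1."""
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count


if __name__ == "__main__":
    docs = ["abracadabra", "banana", "aaaa", "xyz"]
    print(top_k_naive(docs, "a", 2))
    print(log_star(65536))
```

Even for n = 65536 the iterated logarithm is only 4, which is why the abstract calls a query time of O(p + k lg* n) near-optimal relative to O(p + k).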
Pages: 394-404
Page count: 11
Related papers
50 in total
  • [1] Top-k document retrieval in optimal space
    Tsur, Dekel
    [J]. INFORMATION PROCESSING LETTERS, 2013, 113 (12) : 440 - 443
  • [2] Faster Top-k Document Retrieval in Optimal Space
    Navarro, Gonzalo
    Thankachan, Sharma V.
    [J]. STRING PROCESSING AND INFORMATION RETRIEVAL (SPIRE 2013), 2013, 8214 : 255 - 262
  • [3] TIME-OPTIMAL TOP-k DOCUMENT RETRIEVAL
    Navarro, Gonzalo
    Nekrich, Yakov
    [J]. SIAM JOURNAL ON COMPUTING, 2017, 46 (01) : 80 - 113
  • [4] Faster Compact Top-k Document Retrieval
    Konow, Roberto
    Navarro, Gonzalo
    [J]. 2013 DATA COMPRESSION CONFERENCE (DCC), 2013 : 351 - 360
  • [5] New space/time tradeoffs for top-k document retrieval on sequences
    Navarro, Gonzalo
    Thankachan, Sharma V.
    [J]. THEORETICAL COMPUTER SCIENCE, 2014, 542 : 83 - 97
  • [6] A cost-effective approach for mining near-optimal top-k patterns
    Wang, Xin
    Lan, Zhuo
    He, Yu-Ang
    Wang, Yang
    Liu, Zhi-Gui
    Xie, Wen-Bo
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2022, 202
  • [7] Top-K Color Queries for Document Retrieval
    Karpinski, Marek
    Nekrich, Yakov
    [J]. PROCEEDINGS OF THE TWENTY-SECOND ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, 2011 : 401 - 411
  • [8] Faster Compressed Top-k Document Retrieval
    Hon, Wing-Kai
    Shah, Rahul
    Thankachan, Sharma V.
    Vitter, Jeffrey Scott
    [J]. 2013 DATA COMPRESSION CONFERENCE (DCC), 2013 : 341 - 350
  • [9] Top-k Document Retrieval in External Memory
    Shah, Rahul
    Sheng, Cheng
    Thankachan, Sharma V.
    Vitter, Jeffrey Scott
    [J]. ALGORITHMS - ESA 2013, 2013, 8125 : 803 - 814
  • [10] Efficient In-Memory Top-k Document Retrieval
    Culpepper, J. Shane
    Petri, Matthias
    Scholer, Falk
    [J]. SIGIR 2012: PROCEEDINGS OF THE 35TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2012 : 225 - 234