Cache-oblivious index for approximate string matching

Cited by: 0
Authors
Hon, Wing-Kai [1]
Lam, Tak-Wah [2]
Shah, Rahul
Tam, Siu-Lung [2]
Vitter, Jeffrey Scott [3]
Affiliations
[1] Natl Tsing Hua Univ, Dept Comp Sci, Hsinchu, Taiwan
[2] Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
[3] Purdue Univ, Dept Comp Sci, W Lafayette, IN USA
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
This paper revisits the problem of indexing a text for approximate string matching. Specifically, given a text T of length n and a positive integer k, we want to construct an index of T such that for any input pattern P, we can find all its k-error matches in T efficiently. This problem is well-studied in the internal-memory setting. Here, we extend some of these recent results to external-memory solutions, which are also cache-oblivious. Our first index occupies $O((n \log^k n)/B)$ disk pages and finds all k-error matches with $O((|P| + \mathrm{occ})/B + \log^k n \log\log_B n)$ I/Os, where B denotes the number of words in a disk page. To the best of our knowledge, this index is the first external-memory data structure that does not require $\Omega(|P| + \mathrm{occ} + \mathrm{poly}(\log n))$ I/Os. The second index reduces the space to $O((n \log n)/B)$ disk pages, and the I/O complexity is $O((|P| + \mathrm{occ})/B + \log^{k(k+1)} n \log\log n)$.
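To make the problem statement concrete, here is a minimal sketch of what "finding all k-error matches" means. It assumes that errors are counted as edit distance (insertions, deletions, substitutions), which the abstract does not spell out. It is an index-free baseline that scans the whole text with the classical dynamic-programming recurrence, not the cache-oblivious index described in the paper, and the function name `k_error_match_ends` is hypothetical.

```python
def k_error_match_ends(text: str, pattern: str, k: int) -> list[int]:
    """Return every end position j (1-based, so text[:j] ends the match)
    such that some substring of `text` ending at j is within edit
    distance k of `pattern`. Assumption: "k errors" means edit distance.
    Runs in O(len(text) * len(pattern)) time with no index at all."""
    m = len(pattern)
    # col[i] = min edit distance between pattern[:i] and any substring
    # of text ending at the current text position (initially position 0).
    col = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text, start=1):
        prev_diag = col[0]   # value for (i-1, j-1); row 0 is always 0
        col[0] = 0           # a match may start anywhere in the text
        for i in range(1, m + 1):
            cur = col[i]     # value for (i, j-1) before overwriting
            cost = 0 if pattern[i - 1] == ch else 1
            col[i] = min(prev_diag + cost,  # match or substitution
                         cur + 1,           # extra text character
                         col[i - 1] + 1)    # missing pattern character
            prev_diag = cur
        if col[m] <= k:
            ends.append(j)
    return ends


if __name__ == "__main__":
    print(k_error_match_ends("abcdefg", "bcd", 0))
    print(k_error_match_ends("abxd", "abcd", 1))
```

A scan like this reads every character of T for each query. An index of the kind the abstract describes is built once so that the query cost depends on |P| and occ rather than on n.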
Pages: 40 / +
Page count: 3