Cache-oblivious index for approximate string matching

Cited by: 0
Authors
Hon, Wing-Kai [1]
Lam, Tak-Wah [2]
Shah, Rahul
Tam, Siu-Lung [2]
Vitter, Jeffrey Scott [3]
Affiliations
[1] Natl Tsing Hua Univ, Dept Comp Sci, Hsinchu, Taiwan
[2] Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
[3] Purdue Univ, Dept Comp Sci, Indiana, PA USA
DOI
Not available
Chinese Library Classification number
TP301 [Theory and methods];
Discipline classification code
081202;
Abstract
This paper revisits the problem of indexing a text for approximate string matching. Specifically, given a text T of length n and a positive integer k, we want to construct an index of T such that for any input pattern P, we can find all its k-error matches in T efficiently. This problem is well-studied in the internal-memory setting. Here, we extend some of these recent results to external-memory solutions, which are also cache-oblivious. Our first index occupies O((n log^k n)/B) disk pages and finds all k-error matches with O((|P| + occ)/B + log^k n log log_B n) I/Os, where B denotes the number of words in a disk page. To the best of our knowledge, this index is the first external-memory data structure that does not require Ω(|P| + occ + poly(log n)) I/Os. The second index reduces the space to O((n log n)/B) disk pages, and the I/O complexity is O((|P| + occ)/B + log^(k(k+1)) n log log n).
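To make the problem statement concrete, the following is a minimal sketch of what "finding all k-error matches" means. It is not the index described in the abstract: it assumes that an error is a single-character insertion, deletion, or substitution (edit distance), and it uses the classical dynamic-programming scan attributed to Sellers, which reads the whole text for every query rather than consulting a prebuilt index. The function name `k_error_matches` is hypothetical and chosen only for illustration.

```python
def k_error_matches(text, pattern, k):
    """Return the end positions (0-based, inclusive) in `text` at which some
    substring ends whose edit distance to `pattern` is at most k.

    Illustrative baseline only: O(len(text) * len(pattern)) time, no index.
    """
    m = len(pattern)
    # col[i] holds the minimum edit distance between pattern[:i] and any
    # substring of the text ending at the current text position.
    col = list(range(m + 1))
    ends = []
    for j, c in enumerate(text):
        prev_diag = col[0]   # value of col[i-1] in the previous column
        col[0] = 0           # a match may start at any text position
        for i in range(1, m + 1):
            old = col[i]
            cost = 0 if pattern[i - 1] == c else 1
            col[i] = min(prev_diag + cost,  # match or substitution
                         old + 1,           # extra character in the text
                         col[i - 1] + 1)    # missing pattern character
            prev_diag = old
        if col[m] <= k:
            ends.append(j)
    return ends
```

For example, calling `k_error_matches("abcdefg", "bcd", 1)` reports every position where a substring within one edit of "bcd" ends. An index of the kind the abstract describes aims to answer the same query without scanning all of T.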
Pages: 40 / +
Page count: 3
Related papers
50 in total
  • [21] Cache-oblivious scanline algorithm design
    Rahman, Md Mizanur
    COMPUTER GRAPHICS, IMAGING AND VISUALISATION: NEW ADVANCES, 2007, : 22 - 27
  • [22] Low Depth Cache-Oblivious Algorithms
    Blelloch, Guy E.
    Gibbons, Phillip B.
    Simhadri, Harsha Vardhan
    SPAA '10: PROCEEDINGS OF THE TWENTY-SECOND ANNUAL SYMPOSIUM ON PARALLELISM IN ALGORITHMS AND ARCHITECTURES, 2010, : 189 - 199
  • [23] On the limits of cache-oblivious matrix transposition
    Silvestri, Francesco
    TRUSTWORTHY GLOBAL COMPUTING, 2007, 4661 : 233 - 243
  • [24] Cache-aware and cache-oblivious adaptive sorting
    Brodal, GS
    Fagerberg, R
    Moruz, G
    AUTOMATA, LANGUAGES AND PROGRAMMING, PROCEEDINGS, 2005, 3580 : 576 - 588
  • [25] Cache-Oblivious and Data-Oblivious Sorting and Applications
    Chan, T-H. Hubert
    Guo, Yue
    Lin, Wei-Kai
    Shi, Elaine
    SODA'18: PROCEEDINGS OF THE TWENTY-NINTH ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, 2018, : 2201 - 2220
  • [26] Cache-oblivious B-trees
    Bender, MA
    Demaine, ED
    Farach-Colton, M
    41ST ANNUAL SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE, PROCEEDINGS, 2000, : 399 - 409
  • [27] Cache-Oblivious Scheduling of Shared Workloads
    Bar, Arian
    Golab, Lukasz
    Ruehrup, Stefan
    Schiavone, Mirko
    Casas, Pedro
    2015 IEEE 31ST INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2015, : 855 - 866
  • [28] Cache-oblivious databases: Limitations and opportunities
    He, Bingsheng
    Luo, Qiong
    ACM TRANSACTIONS ON DATABASE SYSTEMS, 2008, 33 (02):
  • [29] Cache-oblivious B-trees
    Bender, MA
    Demaine, ED
    Farach-Colton, M
    SIAM JOURNAL ON COMPUTING, 2005, 35 (02) : 341 - 358
  • [30] On the limits of cache-oblivious rational permutations
    Silvestri, Francesco
    THEORETICAL COMPUTER SCIENCE, 2008, 402 (2-3) : 221 - 233