Cache-oblivious index for approximate string matching

Authors
Hon, Wing-Kai [1 ]
Lam, Tak-Wah [2 ]
Shah, Rahul
Tam, Siu-Lung [2 ]
Vitter, Jeffrey Scott [3 ]
Affiliations
[1] Natl Tsing Hua Univ, Dept Comp Sci, Hsinchu, Taiwan
[2] Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
[3] Purdue Univ, Dept Comp Sci, W Lafayette, IN USA
Keywords
DOI
Not available
Chinese Library Classification number
TP301 [Theory, methods];
Discipline classification code
081202 ;
Abstract
This paper revisits the problem of indexing a text for approximate string matching. Specifically, given a text T of length n and a positive integer k, we want to construct an index of T such that for any input pattern P, we can find all its k-error matches in T efficiently. This problem is well-studied in the internal-memory setting. Here, we extend some of these recent results to external-memory solutions, which are also cache-oblivious. Our first index occupies O((n log^k n)/B) disk pages and finds all k-error matches with O((|P| + occ)/B + log^k n log log_B n) I/Os, where B denotes the number of words in a disk page. To the best of our knowledge, this index is the first external-memory data structure that does not require Ω(|P| + occ + poly(log n)) I/Os. The second index reduces the space to O((n log n)/B) disk pages, and the I/O complexity is O((|P| + occ)/B + log^{k(k+1)} n log log n).
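To make the problem statement concrete, the sketch below shows what a "k-error match" is, assuming the usual edit-distance reading (insertions, deletions, substitutions). It is not the paper's index: it is the classical column-by-column dynamic program that scans the whole text, the kind of baseline an index is meant to avoid. The function name `k_error_matches` and the convention of reporting 1-based end positions are illustrative assumptions.

```python
def k_error_matches(text: str, pattern: str, k: int) -> list[int]:
    """Return 1-based end positions j in text such that some substring
    of text ending at j is within edit distance k of pattern.

    Illustrative baseline only (hypothetical helper, not the paper's index):
    O(len(text) * len(pattern)) time, no preprocessing of the text.
    """
    m = len(pattern)
    # col[i] = minimum edit distance between pattern[:i] and any
    # substring of text ending at the current text position.
    col = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text, start=1):
        prev_diag = col[0]  # value of the previous column at row i - 1
        col[0] = 0          # a match may start at any text position
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            new = min(
                prev_diag + cost,  # match or substitute
                col[i] + 1,        # extra text character
                col[i - 1] + 1,    # extra pattern character
            )
            prev_diag = col[i]
            col[i] = new
        if col[m] <= k:
            ends.append(j)
    return ends


if __name__ == "__main__":
    # "cde" (ending at position 5) differs from "cxe" by one substitution.
    print(k_error_matches("abcdefg", "cxe", 1))
```

In this reading, occ in the bounds above corresponds to the number of reported matches, and |P| to `len(pattern)`.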
Pages: 40 / +
Number of pages: 3