Faster Compressed Suffix Trees for Repetitive Text Collections

Authors
Navarro, Gonzalo [1]
Ordonez, Alberto [2]
Affiliations
[1] Univ Chile, Dept Comp Sci, Santiago, Chile
[2] Univ A Coruna, Lab Bases Datos, Coruna, Spain
Source
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recent compressed suffix trees targeted at highly repetitive text collections achieve excellent compression, but their operation times are on the order of milliseconds. We design a new suffix tree representation for this scenario that still achieves very low space usage, only slightly larger than the best previous one, but supports the operations within microseconds. This puts the data structure at the same performance level as compressed suffix trees designed for standard text collections, which on repetitive collections use many times more space than our new structure.
Pages: 424-435
Number of pages: 12
Related papers
  • [21] Dynamic Dictionary Matching and Compressed Suffix Trees
    Chan, Ho-Leung
    Hon, Wing-Kai
    Lam, Tak-Wah
    Sadakane, Kunihiko
    PROCEEDINGS OF THE SIXTEENTH ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, 2005, : 13 - 22
  • [22] Activity Discovery Using Compressed Suffix Trees
    Guha, Prithwijit
    Mukerjee, Amitabha
    Venkatesh, K. S.
    IMAGE ANALYSIS AND PROCESSING - ICIAP 2011, PT II, 2011, 6979 (II): : 69 - +
  • [23] Space-efficient construction of compressed suffix trees
    Prezza, Nicola
    Rosone, Giovanna
    THEORETICAL COMPUTER SCIENCE, 2021, 852 : 138 - 156
  • [24] New text indexing functionalities of the compressed suffix arrays
    Sadakane, K
    JOURNAL OF ALGORITHMS-COGNITION INFORMATICS AND LOGIC, 2003, 48 (02): : 294 - 313
  • [25] Compressed Indexes for Dynamic Text Collections
    Chan, Ho-Leung
    Hon, Wing-Kai
    Lam, Tak-Wah
    Sadakane, Kunihiko
    ACM TRANSACTIONS ON ALGORITHMS, 2007, 3 (02)
  • [26] CHICO: A Compressed Hybrid Index for Repetitive Collections
    Valenzuela, Daniel
    EXPERIMENTAL ALGORITHMS, SEA 2016, 2016, 9685 : 326 - 338
  • [27] ANNOTATED SUFFIX TREE AS A WAY OF TEXT REPRESENTATION FOR INFORMATION RETRIEVAL IN TEXT COLLECTIONS
    Frolov, Dmitry S.
    BIZNES INFORMATIKA-BUSINESS INFORMATICS, 2015, 34 (04): : 63 - 70
  • [28] Compressed text databases with efficient query algorithms based on the compressed suffix array
    Sadakane, K
    ALGORITHM AND COMPUTATION, PROCEEDINGS, 2001, 1969 : 410 - 421
  • [29] Hierarchical clustering of text corpora using suffix trees
    Maslowska, I
    Slowinski, R
    INTELLIGENT INFORMATION PROCESSING AND WEB MINING, 2003, : 179 - 188
  • [30] Dotted suffix trees - A structure for approximate text indexing
    Coelho, Luis Pedro
    Oliveira, Arlindo L.
    STRING PROCESSING AND INFORMATION RETRIEVAL, PROCEEDINGS, 2006, 4209 : 329 - 336