Lempel-Ziv Factorization Using Less Time & Space

Cited by: 22
Authors
Chen, Gang [1]
Puglisi, Simon J. [2]
Smyth, W. F. [3]
Affiliations
[1] McMaster Univ, Dept Comp & Software, Hamilton, ON L8S 4K1, Canada
[2] RMIT Univ, Sch Comp Sci & Informat Technol, Melbourne, Vic 3001, Australia
[3] Curtin Univ Technol, Digital Ecosyst & Business Intelligence Inst, Perth, WA 6845, Australia
Keywords
Lempel-Ziv factorization; suffix array; suffix tree; LZ factorization
DOI
10.1007/s11786-007-0024-4
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
For 30 years the Lempel-Ziv factorization LZ_x of a string x = x[1..n] has been a fundamental data structure of string processing, especially valuable for string compression and for computing all the repetitions (runs) in x. Traditionally the standard method for computing LZ_x was based on Θ(n)-time (or, depending on the measure used, O(n log n)-time) processing of the suffix tree ST_x of x. Recently Abouelhoda et al. proposed an efficient Lempel-Ziv factorization algorithm based on an "enhanced" suffix array, that is, a suffix array SA_x together with supporting data structures, principally an "interval tree". In this paper we introduce a collection of fast, space-efficient algorithms for LZ factorization, also based on suffix arrays, that in theory as well as in many practical circumstances are superior to those previously proposed; one family out of this collection achieves true Θ(n)-time alphabet-independent processing in the worst case by avoiding tree structures altogether.
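To make the object of study concrete, here is a minimal sketch of what an LZ factorization computes. It is a naive quadratic-time illustration and not one of the suffix-array algorithms the abstract describes. The function name `lz_factorize`, the 0-based positions, and the output encoding (a `(letter, 0)` pair for a letter not seen before, a `(position, length)` pair otherwise) are assumptions made only for this example.

```python
def lz_factorize(x):
    """Naive LZ factorization of the string x (illustrative sketch only).

    Each factor is either
      (letter, 0)      for a letter that has not occurred earlier, or
      (pos, length)    for the longest prefix of x[i:] that also starts at
                       some earlier position pos < i (overlap is allowed).
    Positions are 0-based here; the abstract's x[1..n] is 1-based.
    """
    n = len(x)
    factors = []
    i = 0
    while i < n:
        best_len, best_pos = 0, -1
        # Try every earlier starting position and keep the longest match.
        for j in range(i):
            k = 0
            while i + k < n and x[j + k] == x[i + k]:
                k += 1
            if k > best_len:
                best_len, best_pos = k, j
        if best_len == 0:
            factors.append((x[i], 0))   # new letter
            i += 1
        else:
            factors.append((best_pos, best_len))
            i += best_len
    return factors


if __name__ == "__main__":
    print(lz_factorize("abaababa"))
```

For the string `abaababa` this yields the factors a, b, a, aba, ba. The inner scan over all earlier positions is what suffix-tree and suffix-array methods replace with precomputed structures to reach linear time.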
Pages: 605-623
Page count: 19