A fast algorithm for constructing suffix arrays for fixed-size alphabets

Cited by: 0
Authors
Kim, DK [1]
Jo, J
Park, H
Affiliations
[1] Pusan Natl Univ, Sch Elect & Comp Engn, Pusan 609735, South Korea
[2] Hanyang Univ, Coll Informat & Commun, Seoul, South Korea
Source
Keywords
DOI
Not available
Chinese Library Classification number
TP301 [Theory, methods];
Discipline classification code
081202;
Abstract
The suffix array of a string T is essentially a sorted list of all the suffixes of T. Suffix arrays are fundamental index data structures in computational biology. To search for a DNA sequence in a genome sequence, we construct the suffix array of the genome sequence and then search for the DNA sequence in the suffix array. In this paper, we consider the construction of the suffix array of a string T of length n where the size of the alphabet is fixed. It is well known that one can construct the suffix array of T in O(n) time by constructing the suffix tree of T and traversing it. Although this approach takes O(n) time, it is not suitable for practical use because it uses a lot of space and is complicated to implement. Recently, and almost simultaneously, several algorithms have been developed that construct suffix arrays directly in O(n) time. However, these algorithms were developed for integer alphabets and thus do not exploit the properties that hold when the size of the alphabet is fixed. We present a fast algorithm for constructing suffix arrays for fixed-size alphabets. When the size of the alphabet is fixed, our algorithm constructs suffix arrays faster than any other algorithm developed for integer or general alphabets. For example, we reduced the time required to construct suffix arrays for DNA sequences by 25%-38%. In addition, we do not sacrifice space to improve the running time: the space required by our algorithm is almost equal to, or even less than, that required by previous fast algorithms.
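To make the definition and the search use case in the abstract concrete, here is a minimal Python sketch. It is not the algorithm the paper presents: it builds the suffix array by naively sorting all suffixes, which is far slower than the O(n) constructions discussed above, and then finds a pattern by binary search over the sorted suffixes. The function names `build_suffix_array` and `find_occurrences` and the sample text are assumptions made for illustration only.

```python
def build_suffix_array(text: str) -> list:
    """Return the start positions of all suffixes of text in lexicographic order.

    Naive construction for illustration only (sorting full suffixes);
    this is NOT the fast algorithm described in the abstract.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: list, pattern: str) -> list:
    """Return the sorted start positions of pattern in text, using binary search on sa."""
    m = len(pattern)

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # All suffixes in sa[start:lo] begin with pattern.
    return sorted(sa[start:lo])


# Hypothetical toy "genome" over the DNA alphabet {A, C, G, T}.
genome = "ACGTACGA"
sa = build_suffix_array(genome)
print(sa)                                   # [7, 4, 0, 5, 1, 6, 2, 3]
print(find_occurrences(genome, sa, "ACG"))  # [0, 4]
```

Because the suffixes are sorted, all suffixes that start with the pattern occupy one contiguous range of the array, which is why two binary searches suffice to locate every occurrence.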
Pages: 301-314
Page count: 14
Related papers
50 in total
  • [1] A fast algorithm for constructing suffix arrays for DNA alphabets
    Rabea, Zeinab
    El-Metwally, Sara
    Elmougy, Samir
    Zakaria, Magdi
    [J]. JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2022, 34 (07) : 4659 - 4668
  • [2] Constructing compressed suffix arrays with large alphabets
    Hon, WK
    Lam, TW
    Sadakane, K
    Sung, WK
    [J]. ALGORITHMS AND COMPUTATION, PROCEEDINGS, 2003, 2906 : 240 - 249
  • [3] AN ALGORITHM FOR FUNCTIONAL RECONFIGURATION OF FIXED-SIZE ARRAYS
    LOMBARDI, F
    SCIUTO, D
    STEFANELLI, R
    [J]. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 1988, 7 (10) : 1114 - 1118
  • [4] REGULAR PARTITIONING FOR SYNTHESIZING FIXED-SIZE SYSTOLIC ARRAYS
    DARTE, A
    [J]. INTEGRATION-THE VLSI JOURNAL, 1991, 12 (03) : 293 - 304
  • [5] RECONFIGURABLE SYSTOLIC ARRAYS WITH FIXED-SIZE AND STRUCTURE DEGRADATION
    KHARCHENKO, VS
    LITVINENKO, VG
    KRASNOBAEV, VA
    [J]. CYBERNETICS AND SYSTEMS ANALYSIS, 1992, 28 (04) : 623 - 631
  • [6] UTILIZING FIXED-SIZE SYSTOLIC ARRAYS FOR LARGE COMPUTATIONAL PROBLEMS
    PETKOV, N
    [J]. LECTURE NOTES IN COMPUTER SCIENCE, 1989, 399 : 132 - 172
  • [7] UTILIZING FIXED-SIZE SYSTOLIC ARRAYS FOR LARGE COMPUTATIONAL PROBLEMS
    PETKOV, N
    [J]. RECENT ISSUES IN PATTERN ANALYSIS AND RECOGNITION, 1989, 399 : 132 - 172
  • [8] An efficient index data structure with the capabilities of suffix trees and suffix arrays for alphabets of non-negligible size
    Kim, DK
    Jeon, JE
    Park, H
    [J]. STRING PROCESSING AND INFORMATION RETRIEVAL, PROCEEDINGS, 2004, 3246 : 138 - 149
  • [9] A Space and Time Efficient Algorithm for Constructing Compressed Suffix Arrays
    Wing-Kai Hon
    Tak-Wah Lam
    Kunihiko Sadakane
    Wing-Kin Sung
    Siu-Ming Yiu
    [J]. Algorithmica, 2007, 48 : 23 - 36
  • [10] A space and time efficient algorithm for constructing compressed suffix arrays
    Hon, Wing-Kai
    Lam, Tak-Wah
    Sadakane, Kunihiko
    Sung, Wing-Kin
    Yiu, Siu-Ming
    [J]. ALGORITHMICA, 2007, 48 (01) : 23 - 36