Sparse LCS Common Substring Alignment

Cited by: 0
Authors
Landau, GM
Schieber, B
Ziv-Ukelson, M
Affiliations
[1] IBM Corp, Thomas J Watson Res Ctr, Yorktown Hts, NY 10598 USA
[2] Univ Haifa, Dept Comp Sci, IL-31905 Haifa, Israel
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, methods];
Discipline classification code
081202;
Abstract
The "Common Substring Alignment" problem is defined as follows. The input consists of a set of strings S_1, S_2, ..., S_c, with a common substring appearing at least once in each of them, and a target string T. The goal is to compute the similarity of each string S_i with T, without computing the part of the common substring over and over again. In this paper we consider the Common Substring Alignment problem for the LCS (Longest Common Subsequence) similarity metric. Our algorithm gains its efficiency by exploiting the sparsity inherent to the LCS problem. Let Y be the common substring, n be the size of the compared sequences, L_y be the length of the LCS of T and Y, denoted |LCS[T,Y]|, and L be max{|LCS[T,S_i]|}. Our algorithm consists of an O(nL_y) time encoding stage that is executed once per common substring, and an O(L) time alignment stage that is executed once for each appearance of the common substring in each source string. The additional running time depends only on the length of the parts of the strings that are not in any common substring.
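To make the notion of sparsity in the LCS problem concrete, here is a minimal sketch that computes |LCS[T,S]| by visiting only the match points (pairs of positions where the two strings agree) rather than filling the full dynamic-programming table. It follows the well-known Hunt-Szymanski style threshold technique and is not the paper's encoding or alignment stage; the function name `lcs_length_sparse` and the sample strings are assumptions chosen for illustration.

```python
from bisect import bisect_left
from collections import defaultdict


def lcs_length_sparse(t: str, s: str) -> int:
    """Return |LCS[t, s]| by processing only match points.

    Illustrative sketch (Hunt-Szymanski style), not the algorithm
    of the paper described above.
    """
    # For every character, record the positions where it occurs in s.
    positions = defaultdict(list)
    for j, ch in enumerate(s):
        positions[ch].append(j)

    # thresholds[k] is the smallest index in s at which a common
    # subsequence of length k + 1 can end, given the prefix of t seen so far.
    thresholds = []
    for ch in t:
        # Visit matches in decreasing order of j so that a single
        # character of t never extends a subsequence twice.
        for j in reversed(positions.get(ch, ())):
            k = bisect_left(thresholds, j)
            if k == len(thresholds):
                thresholds.append(j)  # a longer common subsequence exists
            else:
                thresholds[k] = j     # same length, but ending earlier in s
    return len(thresholds)


if __name__ == "__main__":
    # Hypothetical inputs: two source strings sharing the substring "GATTACA".
    target = "TGATCACA"
    sources = ["CCGATTACAT", "GATTACAGG"]
    for src in sources:
        print(src, lcs_length_sparse(target, src))
```

The work done is proportional to the number of match points (times a logarithmic factor for the binary search), which is small when the strings rarely agree. This baseline still reprocesses the shared substring for every source string, which is the repeated work the abstract says the paper's encoding stage is designed to avoid.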
Pages: 225-236
Page count: 12
Related papers
50 in total
  • [21] Computing the longest common substring with one mismatch
    M. A. Babenko
    T. A. Starikovskaya
    Problems of Information Transmission, 2011, 47 : 28 - 33
  • [22] The average common substring approach to phylogenomic reconstruction
    Ulitsky, I
    Burstein, D
    Tuller, T
    Chor, B
    JOURNAL OF COMPUTATIONAL BIOLOGY, 2006, 13 (02) : 336 - 350
  • [23] Computing the longest common substring with one mismatch
    Babenko, M. A.
    Starikovskaya, T. A.
    PROBLEMS OF INFORMATION TRANSMISSION, 2011, 47 (01) : 28 - 33
  • [24] Alignment of nematic LCs on surface of amorphous hydrogenated carbon
    Konshina, EA
    SURFACE PHENOMENA - INTERNATIONAL LIQUID CRYSTAL WORKSHOP, 1996, 2731 : 20 - 24
  • [25] A Linear-Space Algorithm for the Substring Constrained Alignment Problem
    Sakai, Yoshifumi
    STRING PROCESSING AND INFORMATION RETRIEVAL, SPIRE 2016, 2016, 9954 : 15 - 21
  • [26] Correction to: Longest Common Substring with Approximately k Mismatches
    Tomasz Kociumaka
    Jakub Radoszewski
    Tatiana Starikovskaya
    Algorithmica, 2019, 81 : 3074 - 3074
  • [27] Nucleotide Sequence Alignment and Compression via Shortest Unique Substring
    Adas, Boran
    Bayraktar, Ersin
    Faro, Simone
    Moustafa, Ibraheem Elsayed
    Kulekci, M. Oguzhan
    BIOINFORMATICS AND BIOMEDICAL ENGINEERING (IWBBIO 2015), PT II, 2015, 9044 : 363 - 374
  • [28] A parallel algorithm for longest common substring of multiple biosequences
    Liu, Wei
    Chen, Ling
    DCABES 2006 PROCEEDINGS, VOLS 1 AND 2, 2006 : 13 - 17
  • [29] Sublinear Space Algorithms for the Longest Common Substring Problem
    Kociumaka, Tomasz
    Starikovskaya, Tatiana
    Vildhoj, Hjalte Wedel
    ALGORITHMS - ESA 2014, 2014, 8737 : 605 - 617
  • [30] Longest common substring for random subshifts of finite type
    Rousseau, Jerome
    ANNALES DE L INSTITUT HENRI POINCARE-PROBABILITES ET STATISTIQUES, 2021, 57 (03) : 1768 - 1785