Computing Maximal Covers for Protein Sequences

Cited by: 1
Authors
Golding, G. Brian [1]
Koponen, Holly [2]
Mhaskar, Neerja [2, 3]
Smyth, W. F. [2]
Affiliations
[1] McMaster Univ, Dept Biol, Hamilton, ON, Canada
[2] McMaster Univ, Dept Comp & Software, Hamilton, ON, Canada
[3] McMaster Univ, Dept Comp & Software, 1280 Main St West, Hamilton, ON L8S 4L8, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
MAXCOVER; MUMmer; protein; repeats; string covers; ARRAY
DOI
10.1089/cmb.2021.0520
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
A partial cover of a string or sequence of length n, which we model as an array x = x[1..n], is a repeating substring u of x such that "many" positions in x lie within occurrences of u. A maximal cover u* (introduced in 2018 by Mhaskar and Smyth as the optimal cover) is a partial cover that, over all partial covers u, maximizes the number of positions covered. Applying data structures also introduced by Mhaskar and Smyth, our software MAXCOVER for the first time enables efficient computation of u* for any x; in particular, as described here, for protein sequences of Arabidopsis, Caenorhabditis elegans, Drosophila melanogaster, and humans. In this protein context, we also compare an extended version of MAXCOVER with existing software (MUMmer's repeat-match) for the closely related task of computing non-extendible repeating substrings (a.k.a. maximal repeats). In practice, MAXCOVER is an order of magnitude faster than MUMmer, with much lower space requirements, while producing more compact output that nevertheless yields a more exact and user-friendly specification of the repeats.
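To make the definition of a maximal cover concrete, the following is a minimal brute-force sketch in Python. It is not the MAXCOVER algorithm and does not use the data structures the abstract refers to; it simply tries every repeating substring and counts the positions its occurrences cover, so it is only practical for very short strings. The function names `covered_positions` and `naive_maximal_cover` are hypothetical, and the tie-breaking rule (prefer the shortest, then leftmost, candidate) is an assumption made for illustration, not something stated in the abstract.

```python
from typing import Optional, Tuple


def covered_positions(x: str, u: str) -> int:
    """Count positions of x that lie inside at least one occurrence of u."""
    covered = [False] * len(x)
    start = x.find(u)
    while start != -1:
        for i in range(start, start + len(u)):
            covered[i] = True
        # Advance by one so that overlapping occurrences are also found.
        start = x.find(u, start + 1)
    return sum(covered)


def naive_maximal_cover(x: str) -> Tuple[Optional[str], int]:
    """Return (u, positions covered) for a repeating substring u that
    covers the most positions of x, or (None, 0) if nothing repeats.

    Assumption: ties are broken in favour of the shortest, leftmost u.
    """
    best, best_cov = None, 0
    seen = set()
    n = len(x)
    for length in range(1, n):
        for i in range(n - length + 1):
            u = x[i:i + length]
            if u in seen:
                continue
            seen.add(u)
            # i is the first occurrence of u; it repeats only if it
            # occurs again somewhere to the right (overlap allowed).
            if x.find(u, i + 1) == -1:
                continue
            cov = covered_positions(x, u)
            if cov > best_cov:
                best, best_cov = u, cov
    return best, best_cov


if __name__ == "__main__":
    print(naive_maximal_cover("abacab"))
    print(naive_maximal_cover("abcabcab"))
```

In the first call the repeating substring "ab" covers four of the six positions, more than any other repeating substring. In the second call "abcab" covers every position through two overlapping occurrences, which is the special case in which a partial cover is a cover of the whole string.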
Pages: 149-160
Number of pages: 12