Computing Maximal Covers for Protein Sequences

Authors
Golding, G. Brian [1]
Koponen, Holly [2]
Mhaskar, Neerja [2, 3]
Smyth, W. F. [2]
Affiliations
[1] McMaster Univ, Dept Biol, Hamilton, ON, Canada
[2] McMaster Univ, Dept Comp & Software, Hamilton, ON, Canada
[3] McMaster Univ, Dept Comp & Software, 1280 Main St West, Hamilton, ON L8S 4L8, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
MAXCOVER; MUMmer; protein; repeats; string covers; ARRAY
DOI
10.1089/cmb.2021.0520
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
A partial cover of a string or sequence of length n, which we model as an array x = x[1..n], is a repeating substring u of x such that "many" positions in x lie within occurrences of u. A maximal cover u* (introduced in 2018 by Mhaskar and Smyth under the name optimal cover) is a partial cover that, over all partial covers u, maximizes the number of positions covered. Applying data structures also introduced by Mhaskar and Smyth, our software MAXCOVER for the first time enables efficient computation of u* for any x; in particular, as described here, for protein sequences of Arabidopsis, Caenorhabditis elegans, Drosophila melanogaster, and humans. In this protein context, we also compare an extended version of MAXCOVER with existing software (MUMmer's repeat-match) for the closely related task of computing non-extendible repeating substrings (also known as maximal repeats). In practice, MAXCOVER is an order of magnitude faster than MUMmer, with much lower space requirements, while producing more compact output that nevertheless yields a more exact and user-friendly specification of the repeats.
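To make the definition in the abstract concrete, the following is a minimal brute-force sketch in Python. It is not the MAXCOVER algorithm and does not use the data structures of Mhaskar and Smyth; it simply enumerates every substring that occurs at least twice and counts the positions covered by its (possibly overlapping) occurrences. The function names `covered_positions` and `naive_maximal_cover` are hypothetical, indices are 0-based rather than the 1-based x[1..n] of the abstract, and the tie-breaking rule (prefer the longer substring) is an assumption made only for this illustration.

```python
from collections import defaultdict


def covered_positions(x, u):
    """Count positions of x that lie within at least one occurrence of u."""
    covered = [False] * len(x)
    start = x.find(u)
    while start != -1:
        for i in range(start, start + len(u)):
            covered[i] = True
        # Advance by one so that overlapping occurrences are also found.
        start = x.find(u, start + 1)
    return sum(covered)


def naive_maximal_cover(x):
    """Return (u, count) for a repeating substring u covering the most positions.

    Brute force over all substrings, so suitable only for short strings.
    Ties are broken in favour of the longer substring (an assumption of
    this sketch). Returns (None, 0) if no substring occurs twice.
    """
    n = len(x)
    occurrences = defaultdict(int)
    for i in range(n):
        for j in range(i + 1, n + 1):
            occurrences[x[i:j]] += 1

    best, best_count = None, 0
    for u, occ in occurrences.items():
        if occ < 2:
            continue  # u must repeat to qualify as a partial cover
        count = covered_positions(x, u)
        longer_tie = best is not None and count == best_count and len(u) > len(best)
        if count > best_count or longer_tie:
            best, best_count = u, count
    return best, best_count


if __name__ == "__main__":
    print(naive_maximal_cover("abcab"))
```

For the string `abcab`, the repeating substrings are `a`, `b`, and `ab`; of these, `ab` covers four of the five positions, so the sketch reports it as the maximal cover. The enumeration builds every substring explicitly, which is far too slow for full protein sequences and is intended only to illustrate the definition.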
Pages: 149-160
Number of pages: 12