Simple and fast inverse alignment

Citations: 0
Authors
Kececioglu, John [1]
Kim, Eagui [1]
Affiliations
[1] Univ Arizona, Dept Comp Sci, Tucson, AZ 85721 USA
Keywords
sequence analysis; parametric sequence alignment; substitution score matrices; affine gap penalties; supervised learning; linear programming; cutting plane algorithms
DOI
Not available
Chinese Library Classification
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
For as long as biologists have been computing alignments of sequences, the question of what values to use for scoring substitutions and gaps has persisted. While some choices for substitution scores are now common, largely due to convention, there is no standard for choosing gap penalties. An objective way to resolve this question is to learn the appropriate values by solving the Inverse String Alignment Problem: given examples of correct alignments, find parameter values that make the examples be optimal-scoring alignments of their strings. We present a new polynomial-time algorithm for Inverse String Alignment that is simple to implement, fast in practice, and for the first time can learn hundreds of parameters simultaneously. The approach is also flexible: minor modifications allow us to solve inverse unique alignment (find parameter values that make the examples be the unique optimal alignments of their strings), and inverse near-optimal alignment (find parameter values that make the example alignments be as close to optimal as possible). Computational results with an implementation for global alignment show that, for the first time, we can find best-possible values for all 212 parameters of the standard protein-sequence scoring-model from hundreds of alignments in a few minutes of computation.
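The keywords point to linear programming with cutting planes. As a rough illustration of the separation step such a method needs, the sketch below checks whether an example alignment is optimal under given parameter values and, if not, returns a violated linear inequality. It is a minimal sketch under strong assumptions, not the authors' implementation: it uses a hypothetical two-parameter cost model (one mismatch cost and one per-column gap cost, minimized) instead of the 212-parameter model with affine gap penalties, and the names `features`, `optimal_features`, and `violated_cut` are invented for this example.

```python
from typing import Optional, Tuple

Features = Tuple[int, int]  # (mismatch columns, gap columns); assumed toy model


def features(row_a: str, row_b: str) -> Features:
    """Count mismatch and gap columns of an alignment given as two gapped rows."""
    mismatches = gaps = 0
    for x, y in zip(row_a, row_b):
        if x == "-" or y == "-":
            gaps += 1
        elif x != y:
            mismatches += 1
    return mismatches, gaps


def cost(f: Features, w: Tuple[float, float]) -> float:
    """Alignment cost is linear in the parameters w = (mismatch cost, gap cost)."""
    return f[0] * w[0] + f[1] * w[1]


def optimal_features(a: str, b: str, w: Tuple[float, float]) -> Features:
    """Global alignment by dynamic programming, minimizing cost.

    Each cell stores (cost, mismatches, gaps) of one best alignment of the
    prefixes, so the feature counts of an optimal alignment come out directly.
    """
    mis, gap = w
    n, m = len(a), len(b)
    best = [[(0, 0, 0)] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0] = (i * gap, 0, i)
    for j in range(1, m + 1):
        best[0][j] = (j * gap, 0, j)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = 0 if a[i - 1] == b[j - 1] else 1
            c, x, g = best[i - 1][j - 1]
            diag = (c + d * mis, x + d, g)
            c, x, g = best[i - 1][j]
            up = (c + gap, x, g + 1)
            c, x, g = best[i][j - 1]
            left = (c + gap, x, g + 1)
            best[i][j] = min(diag, up, left)
    _, x, g = best[n][m]
    return x, g


def violated_cut(row_a: str, row_b: str, w: Tuple[float, float],
                 eps: float = 1e-9) -> Optional[Features]:
    """Return coefficients c with c . w <= 0 violated by w, or None.

    None means the example alignment is already optimal under w.
    """
    f_example = features(row_a, row_b)
    f_opt = optimal_features(row_a.replace("-", ""), row_b.replace("-", ""), w)
    if cost(f_example, w) <= cost(f_opt, w) + eps:
        return None
    # The example must cost no more than this competing alignment:
    # (f_example - f_opt) . w <= 0
    return f_example[0] - f_opt[0], f_example[1] - f_opt[1]
```

In a full cutting-plane loop, each returned inequality would be added to a linear program over the parameters, the program re-solved, and the check repeated until no example yields a violated inequality. The linear-program solve itself is omitted here.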
Pages: 441-455
Page count: 15