Comparative assessment of methods for aligning multiple genome sequences

Cited by: 27
Authors
Chen, Xiaoyu [1 ]
Tompa, Martin [1 ]
Affiliations
[1] Univ Washington, Dept Comp Sci & Engn, Dept Genome Sci, Seattle, WA 98195 USA
Funding
US National Institutes of Health; Natural Sciences and Engineering Research Council of Canada;
Keywords
NONCODING SEQUENCES; ALIGNMENT; UNCERTAINTY; VERTEBRATE; CONSTRAINT; DISCOVERY; THOUSANDS;
DOI
10.1038/nbt.1637
Chinese Library Classification number
Q81 [Bioengineering (biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Multiple sequence alignment is a difficult computational problem. There have been compelling pleas for methods to assess whole-genome multiple sequence alignments and compare the alignments produced by different tools. We assess the four ENCODE alignments, each of which aligns 28 vertebrates on 554 Mbp of total input sequence. We measure the level of agreement among the alignments and compare their coverage and accuracy. We find a disturbing lack of agreement among the alignments not only in species distant from human, but even in mouse, a well-studied model organism. Overall, the assessment shows that Pecan produces the most accurate or nearly most accurate alignment in all species and genomic location categories, while still providing coverage comparable to or better than that of the other alignments in the placental mammals. Our assessment reveals that constructing accurate whole-genome multiple sequence alignments remains a significant challenge, particularly for noncoding regions and distantly related species.
Pages: 567 / U53
Number of pages: 8