Comparative assessment of methods for aligning multiple genome sequences

Cited by: 27
Authors
Chen, Xiaoyu [1]
Tompa, Martin [1]
Affiliations
[1] Univ Washington, Dept Comp Sci & Engn, Dept Genome Sci, Seattle, WA 98195 USA
Funding
US National Institutes of Health (NIH); Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
NONCODING SEQUENCES; ALIGNMENT; UNCERTAINTY; VERTEBRATE; CONSTRAINT; DISCOVERY; THOUSANDS;
DOI
10.1038/nbt.1637
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Subject classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Multiple sequence alignment is a difficult computational problem. There have been compelling pleas for methods to assess whole-genome multiple sequence alignments and compare the alignments produced by different tools. We assess the four ENCODE alignments, each of which aligns 28 vertebrates on 554 Mbp of total input sequence. We measure the level of agreement among the alignments and compare their coverage and accuracy. We find a disturbing lack of agreement among the alignments not only in species distant from human, but even in mouse, a well-studied model organism. Overall, the assessment shows that Pecan produces the most accurate or nearly most accurate alignment in all species and genomic location categories, while still providing coverage comparable to or better than that of the other alignments in the placental mammals. Our assessment reveals that constructing accurate whole-genome multiple sequence alignments remains a significant challenge, particularly for noncoding regions and distantly related species.
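As a rough illustration of what "coverage" and "agreement" between two alignments could mean, the sketch below compares two toy pairwise projections of a reference sequence onto another species. This is not the authors' method: the data representation (a dictionary from reference position to aligned position), the function names `coverage` and `agreement`, and the toy values are all assumptions made for this example.

```python
from typing import Dict, Optional

# Hypothetical representation: reference (e.g. human) position -> aligned
# position in another species, or None if the position is left unaligned.
Alignment = Dict[int, Optional[int]]


def coverage(aln: Alignment, n_positions: int) -> float:
    """Fraction of the n_positions reference positions that are aligned."""
    aligned = sum(1 for p in range(n_positions) if aln.get(p) is not None)
    return aligned / n_positions


def agreement(a: Alignment, b: Alignment) -> float:
    """Among positions aligned by both a and b, the fraction mapped identically."""
    shared = [p for p in a if a[p] is not None and b.get(p) is not None]
    if not shared:
        return 0.0
    same = sum(1 for p in shared if a[p] == b[p])
    return same / len(shared)


# Toy data (invented for illustration only).
aln_a: Alignment = {0: 10, 1: 11, 2: 12, 3: None}
aln_b: Alignment = {0: 10, 1: 11, 2: 40, 3: 41}

print(coverage(aln_a, 4))        # 3 of 4 positions aligned
print(coverage(aln_b, 4))        # all 4 positions aligned
print(agreement(aln_a, aln_b))   # 2 of the 3 shared positions agree
```

Restricting agreement to positions that both alignments cover keeps it separate from coverage, so an alignment is not penalized twice for leaving a region unaligned. Other definitions are possible.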
Pages: 567 / U53
Page count: 8
Related papers
50 in total
  • [1] Comparative assessment of methods for aligning multiple genome sequences
    Xiaoyu Chen
    Martin Tompa
    Nature Biotechnology, 2010, 28 : 567 - 572
  • [2] Aligning multiple sequences by genetic algorithm
    Liu, LF
    Huo, HW
    Wang, BS
    2004 INTERNATIONAL CONFERENCE ON COMMUNICATION, CIRCUITS, AND SYSTEMS, VOLS 1 AND 2: VOL 1: COMMUNICATION THEORY AND SYSTEMS, 2004, : 994 - 998
  • [3] Aligning sequences from multiple cameras
    Korah, T
    Rasmussen, C
    2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 941 - 944
  • [4] GLProbs: Aligning Multiple Sequences Adaptively
    Ye, Yongtao
    Cheung, David Wai-Lok
    Wang, Yadong
    Yiu, Siu-Ming
    Zhang, Qing
    Lam, Tak-Wah
    Ting, Hing-Fung
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2015, 12 (01) : 67 - 78
  • [5] Bioinformatics methods for the comparative analysis of metazoan mitochondrial genome sequences
    Bernt, Matthias
    Braband, Anke
    Middendorf, Martin
    Misof, Bernhard
    Rota-Stabelli, Omar
    Stadler, Peter F.
    MOLECULAR PHYLOGENETICS AND EVOLUTION, 2013, 69 (02) : 320 - 327
  • [6] Aligning multiple genomic sequences with the threaded blockset aligner
    Blanchette, M
    Kent, WJ
    Riemer, C
    Elnitski, L
    Smit, AFA
    Roskin, KM
    Baertsch, R
    Rosenbloom, K
    Clawson, H
    Green, ED
    Haussler, D
    Miller, W
    GENOME RESEARCH, 2004, 14 (04) : 708 - 715
  • [7] A Parallel GWO Technique for Aligning Multiple Molecular Sequences
    Jayapriya, J.
    Arock, Michael
    2015 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2015, : 210 - 215
  • [8] A fast algorithm aligning multiple microbial genomic sequences
    Hu, Guangyue
    Shen, Shiyi
    2005 27TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY, VOLS 1-7, 2005, : 240 - 243
  • [9] ON MULTIPLE ALIGNMENT OF GENOME SEQUENCES
    OHYA, M
    MIYAZAKI, S
    OGATA, K
    IEICE TRANSACTIONS ON COMMUNICATIONS, 1992, E75B (06) : 453 - 457
  • [10] Comparative analyses of whole-genome protein sequences from multiple organisms
    Yokono, Makio
    Satoh, Soichirou
    Tanaka, Ayumi
    SCIENTIFIC REPORTS, 2018, 8