Fast algorithms for large-scale genome alignment and comparison

Cited by: 695
Authors
Delcher, AL
Phillippy, A
Carlton, J
Salzberg, SL
Affiliations
[1] Inst Genom Res, Rockville, MD 20850 USA
[2] Loyola Coll, Dept Comp Sci, Baltimore, MD 21210 USA
[3] Celera Genom, Rockville, MD 20850 USA
[4] Johns Hopkins Univ, Dept Comp Sci, Baltimore, MD 21218 USA
DOI
10.1093/nar/30.11.2478
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
We describe a suffix-tree algorithm that can align the entire genome sequences of eukaryotic and prokaryotic organisms with minimal use of computer time and memory. The new system, MUMmer 2, runs three times faster while using one-third as much memory as the original MUMmer system. It has been used successfully to align the entire human and mouse genomes to each other, and to align numerous smaller eukaryotic and prokaryotic genomes. A new module permits the alignment of multiple DNA sequence fragments, which has proven valuable in the comparison of incomplete genome sequences. We also describe a method to align more distantly related genomes by detecting protein sequence homology. This extension to MUMmer aligns two genomes after translating the sequence in all six reading frames, extracts all matching protein sequences and then clusters together matches. This method has been applied to both incomplete and complete genome sequences in order to detect regions of conserved synteny, in which multiple proteins from one organism are found in the same order and orientation in another. The system code is being made freely available by the authors.
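The six-frame translation step mentioned in the abstract can be illustrated with a short sketch. This is not MUMmer's code: the function names (`reverse_complement`, `translate`, `six_frame_translations`) and the frame labels (`+1` to `-3`) are hypothetical, and the standard genetic code is assumed. It shows only how a DNA sequence yields six candidate protein sequences, three per strand. The matching and clustering steps are not covered.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (an assumption;
# organisms with alternative codes would need a different table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(dna):
    """Map hypothetical frame labels '+1'..'+3', '-1'..'-3' to protein strings."""
    dna = dna.upper()
    frames = {}
    for strand, s in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames


if __name__ == "__main__":
    for label, protein in six_frame_translations("ATGGCCTAA").items():
        print(label, protein)
```

Comparing at the protein level tolerates silent nucleotide changes, which is why translated sequences can still match between genomes that have diverged too far for direct DNA matching.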
Pages: 2478 - 2483
Page count: 6
Related papers
50 in total
  • [42] Parallel algorithms for large-scale biological sequence alignment on Xeon-Phi based clusters
    Lan, Haidong
    Chan, Yuandong
    Xu, Kai
    Schmidt, Bertil
    Peng, Shaoliang
    Liu, Weiguo
    BMC BIOINFORMATICS, 2016, 17
  • [43] Parallel algorithms for large-scale biological sequence alignment on Xeon-Phi based clusters
    Haidong Lan
    Yuandong Chan
    Kai Xu
    Bertil Schmidt
    Shaoliang Peng
    Weiguo Liu
    BMC Bioinformatics, 17
  • [44] Fast Large-Scale Trajectory Clustering
    Wang, Sheng
    Bao, Zhifeng
    Culpepper, J. Shane
    Sellis, Timos
    Qin, Xiaolin
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2019, 13 (01) : 29 - 42
  • [45] Fast large-scale reionization simulations
    Thomas, Rajat M.
    Zaroubi, Saleem
    Ciardi, Benedetta
    Pawlik, Andreas H.
    Labropoulos, Panagiotis
    Jelic, Vibor
    Bernardi, Gianni
    Brentjens, Michiel A.
    de Bruyn, A. G.
    Harker, Geraint J. A.
    Koopmans, Leon V. E.
    Mellema, Garrelt
    Pandey, V. N.
    Schaye, Joop
    Yatawatta, Sarod
    MONTHLY NOTICES OF THE ROYAL ASTRONOMICAL SOCIETY, 2009, 393 (01) : 32 - 48
  • [46] Scalable Algorithms for Bayesian Inference of Large-Scale Models from Large-Scale Data
    Ghattas, Omar
    Isaac, Tobin
    Petra, Noemi
    Stadler, Georg
    HIGH PERFORMANCE COMPUTING FOR COMPUTATIONAL SCIENCE - VECPAR 2016, 2017, 10150 : 3 - 6
  • [47] A New Genome-to-Genome Comparison Approach for Large-Scale Revisiting of Current Microbial Taxonomy
    Tsai, Ming-Hsin
    Liu, Yen-Yi
    Soo, Von-Wun
    Chen, Chih-Chieh
    MICROORGANISMS, 2019, 7 (06)
  • [48] Fast algorithms for large-scale periodic structures using subentire domain basis functions
    Lu, WB
    Cui, TJ
    Yin, XX
    Qian, ZG
    Hong, W
    IEEE TRANSACTIONS ON ANTENNAS AND PROPAGATION, 2005, 53 (03) : 1154 - 1162
  • [49] Fast and Message-Efficient Global Snapshot Algorithms for Large-Scale Distributed Systems
    Kshemkalyani, Ajay D.
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2010, 21 (09) : 1281 - 1289
  • [50] Fast Local Alignment of Protein Pockets (FLAPP): A System-Compiled Program for Large-Scale Binding Site Alignment
    Sankar, Santhosh
    Sakthivel, Naren Chandran
    Chandra, Nagasuma
    JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2022, : 4810 - 4819