Fast algorithms for large-scale genome alignment and comparison

Cited by: 697
Authors
Delcher, AL
Phillippy, A
Carlton, J
Salzberg, SL
Affiliations
[1] Inst Genom Res, Rockville, MD 20850 USA
[2] Loyola Coll, Dept Comp Sci, Baltimore, MD 21210 USA
[3] Celera Genom, Rockville, MD 20850 USA
[4] Johns Hopkins Univ, Dept Comp Sci, Baltimore, MD 21218 USA
DOI
10.1093/nar/30.11.2478
Chinese Library Classification (CLC)
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
We describe a suffix-tree algorithm that can align the entire genome sequences of eukaryotic and prokaryotic organisms with minimal use of computer time and memory. The new system, MUMmer 2, runs three times faster while using one-third as much memory as the original MUMmer system. It has been used successfully to align the entire human and mouse genomes to each other, and to align numerous smaller eukaryotic and prokaryotic genomes. A new module permits the alignment of multiple DNA sequence fragments, which has proven valuable in the comparison of incomplete genome sequences. We also describe a method to align more distantly related genomes by detecting protein sequence homology. This extension to MUMmer aligns two genomes after translating the sequence in all six reading frames, extracts all matching protein sequences and then clusters together matches. This method has been applied to both incomplete and complete genome sequences in order to detect regions of conserved synteny, in which multiple proteins from one organism are found in the same order and orientation in another. The system code is being made freely available by the authors.
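The six-frame translation step mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the MUMmer code: the function names (`reverse_complement`, `translate`, `six_frame_translations`) are hypothetical, the standard genetic code is assumed, and codons containing ambiguous bases are written as `X` by assumption.

```python
from itertools import product

# Standard genetic code, listed in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna: str, offset: int) -> str:
    """Translate dna starting at offset (0, 1 or 2); '*' marks a stop codon.

    Trailing bases that do not fill a codon are dropped, and codons with
    ambiguous bases become 'X' (an assumption of this sketch).
    """
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(offset, len(dna) - 2, 3)
    )


def six_frame_translations(dna: str) -> dict[str, str]:
    """Translate both strands in all three reading frames each."""
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna, offset)
        frames[f"-{offset + 1}"] = translate(rc, offset)
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translations("ATGGCCTAA").items():
        print(frame, protein)
```

Matching on the translated sequences rather than on the DNA tolerates synonymous codon changes, which is presumably why a protein-level comparison helps with more distantly related genomes.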
Pages: 2478-2483
Page count: 6