Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega

Cited by: 11059
Authors
Sievers, Fabian [1]
Wilm, Andreas [2]
Dineen, David [1]
Gibson, Toby J. [3]
Karplus, Kevin [4]
Li, Weizhong [5]
Lopez, Rodrigo [5]
McWilliam, Hamish [5]
Remmert, Michael [6]
Soeding, Johannes [6]
Thompson, Julie D. [7]
Higgins, Desmond G. [1]
Affiliations
[1] Univ Coll Dublin, UCD Conway Inst Biomol & Biomed Res, Sch Med & Med Sci, Dublin 4, Ireland
[2] Genome Inst Singapore, Singapore, Singapore
[3] European Mol Biol Lab, Struct & Computat Biol Unit, Heidelberg, Germany
[4] Univ Calif Santa Cruz, Dept Biomol Engn, Santa Cruz, CA 95064 USA
[5] European Bioinformat Inst, EMBL Outstn, Cambridge, England
[6] Univ Munich LMU, Gene Ctr Munich, Munich, Germany
[7] Univ Strasbourg, Dept Biol Struct & Genom, IGBMC, CNRS, INSERM, Illkirch Graffenstaden, France
Funding
Science Foundation Ireland;
Keywords
bioinformatics; hidden Markov models; multiple sequence alignment; CONSTRUCTION; ALGORITHM; ACCURATE; COFFEE; TREES;
DOI
10.1038/msb.2011.75
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Multiple sequence alignments are fundamental to many sequence analysis methods. Most alignments are computed using the progressive alignment heuristic. These methods are starting to become a bottleneck in some analysis pipelines when faced with data sets of the size of many thousands of sequences. Some methods allow computation of larger data sets while sacrificing quality, and others produce high-quality alignments, but scale badly with the number of sequences. In this paper, we describe a new program called Clustal Omega, which can align virtually any number of protein sequences quickly and that delivers accurate alignments. The accuracy of the package on smaller test cases is similar to that of the high-quality aligners. On larger data sets, Clustal Omega outperforms other packages in terms of execution time and quality. Clustal Omega also has powerful features for adding sequences to and exploiting information in existing alignments, making use of the vast amount of precomputed information in public databases like Pfam.
Molecular Systems Biology 7: 539; published online 11 October 2011; doi:10.1038/msb.2011.75
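As a rough illustration of the two uses the abstract mentions (aligning a set of protein sequences, and adding sequences to an existing alignment), the sketch below assembles command lines for the `clustalo` executable. It only builds the argument list and does not run anything. The function name `build_clustalo_command` and all file names are hypothetical; the options `-i`, `-o`, `--outfmt`, `--profile1` and `--hmm-in` are taken from the Clustal Omega command-line help, and their exact behaviour should be checked against the installed version.

```python
import shlex


def build_clustalo_command(infile, outfile, outfmt="fasta", profile=None, hmm=None):
    """Return an argv list for a clustalo run (hypothetical helper; nothing is executed)."""
    cmd = ["clustalo", "-i", infile, "-o", outfile, "--outfmt", outfmt]
    if profile is not None:
        # Existing alignment that the new sequences should be aligned against.
        cmd += ["--profile1", profile]
    if hmm is not None:
        # External profile HMM (for example one built from a Pfam family) used as guidance.
        cmd += ["--hmm-in", hmm]
    return cmd


# Plain alignment of unaligned sequences (file names are placeholders).
plain = build_clustalo_command("seqs.fasta", "aligned.aln", outfmt="clu")

# Adding new sequences to an existing alignment (file names are placeholders).
added = build_clustalo_command("new_seqs.fasta", "merged.fasta", profile="existing.fasta")

print(shlex.join(plain))
print(shlex.join(added))
```

Passing the resulting list to `subprocess.run(cmd, check=True)` would execute the alignment, assuming `clustalo` is installed and on the `PATH`.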
Pages: 6