Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega

Cited by: 11059
Authors
Sievers, Fabian [1]
Wilm, Andreas [2]
Dineen, David [1]
Gibson, Toby J. [3]
Karplus, Kevin [4]
Li, Weizhong [5]
Lopez, Rodrigo [5]
McWilliam, Hamish [5]
Remmert, Michael [6]
Soeding, Johannes [6]
Thompson, Julie D. [7]
Higgins, Desmond G. [1]
Affiliations
[1] Univ Coll Dublin, UCD Conway Inst Biomol & Biomed Res, Sch Med & Med Sci, Dublin 4, Ireland
[2] Genome Inst Singapore, Singapore, Singapore
[3] European Mol Biol Lab, Struct & Computat Biol Unit, Heidelberg, Germany
[4] Univ Calif Santa Cruz, Dept Biomol Engn, Santa Cruz, CA 95064 USA
[5] European Bioinformat Inst, EMBL Outstn, Cambridge, England
[6] Univ Munich LMU, Gene Ctr Munich, Munich, Germany
[7] Univ Strasbourg, Dept Biol Struct & Genom, IGBMC, CNRS, INSERM, Illkirch Graffenstaden, France
Funding
Science Foundation Ireland;
Keywords
bioinformatics; hidden Markov models; multiple sequence alignment; CONSTRUCTION; ALGORITHM; ACCURATE; COFFEE; TREES;
DOI
10.1038/msb.2011.75
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification code
071010 ; 081704 ;
Abstract
Multiple sequence alignments are fundamental to many sequence analysis methods. Most alignments are computed using the progressive alignment heuristic. These methods are starting to become a bottleneck in some analysis pipelines when faced with data sets of the size of many thousands of sequences. Some methods allow computation of larger data sets while sacrificing quality, and others produce high-quality alignments, but scale badly with the number of sequences. In this paper, we describe a new program called Clustal Omega, which can align virtually any number of protein sequences quickly and that delivers accurate alignments. The accuracy of the package on smaller test cases is similar to that of the high-quality aligners. On larger data sets, Clustal Omega outperforms other packages in terms of execution time and quality. Clustal Omega also has powerful features for adding sequences to and exploiting information in existing alignments, making use of the vast amount of precomputed information in public databases like Pfam.
Molecular Systems Biology 7: 539; published online 11 October 2011; doi:10.1038/msb.2011.75
Pages: 6