QuickProbs 2: Towards rapid construction of high-quality alignments of large protein families

Cited by: 7
Authors
Gudys, Adam [1 ]
Deorowicz, Sebastian [1 ]
Affiliations
[1] Silesian Tech Univ, Inst Informat, Akad 16, PL-44100 Gliwice, Poland
Source
SCIENTIFIC REPORTS | 2017 / Volume 7
Keywords
MULTIPLE SEQUENCE ALIGNMENT; GUIDE TREES; ACCURACY; IMPROVEMENT; ALGORITHMS; DATABASE; MODELS; MAFFT;
DOI
10.1038/srep41553
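Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];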
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
The ever-increasing size of sequence databases, caused by the development of high-throughput sequencing, poses one of the greatest challenges yet to multiple alignment algorithms. As we show, well-established techniques employed for increasing alignment quality, i.e., refinement and consistency, are ineffective when large protein families are investigated. We present QuickProbs 2, an algorithm for multiple sequence alignment. Based on probabilistic models and equipped with novel column-oriented refinement and selective consistency, it offers outstanding accuracy. When analysing hundreds of sequences, QuickProbs 2 is noticeably better than ClustalO and MAFFT, the previous leaders for processing numerous protein families. In the case of smaller sets, for which consistency-based methods are the best performing, QuickProbs 2 is also superior to the competitors. Due to the low computational requirements of selective consistency and the utilization of massively parallel architectures, the presented algorithm has execution times similar to ClustalO and is orders of magnitude faster than full-consistency approaches such as MSAProbs or PicXAA. All this makes QuickProbs 2 an excellent tool for aligning families ranging from a few to hundreds of proteins.
Pages: 12
Related papers
50 in total
  • [31] HIGH-QUALITY OXIDES FOR LARGE AREA DISPLAYS
    JOHNSON, LM
    SAUNDERS, M
    MEAKIN, DB
    JOURNAL DE PHYSIQUE IV, 1991, 1 (C2): : 505 - 505
  • [32] Towards high-quality, high-speed numerical computation
    Hains, G
    vanEmden, MH
    1997 IEEE PACIFIC RIM CONFERENCE ON COMMUNICATIONS, COMPUTERS AND SIGNAL PROCESSING, VOLS 1 AND 2: PACRIM 10 YEARS - 1987-1997, 1997, : 265 - 268
  • [33] RPfam: A refiner towards curated-like multiple sequence alignments of the Pfam protein families
    Wei, Qingting
    Zou, Hong
    Zhong, Cuncong
    Xu, Jianfeng
    JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, 2022, 20 (04)
  • [34] A route towards the fabrication of large-scale and high-quality perovskite films for optoelectronic devices
    Ehsan Rezaee
    Dimitar Kutsarov
    Bowei Li
    Jinxin Bi
    S. Ravi P. Silva
    Scientific Reports, 12
  • [35] A route towards the fabrication of large-scale and high-quality perovskite films for optoelectronic devices
    Rezaee, Ehsan
    Kutsarov, Dimitar
    Li, Bowei
    Bi, Jinxin
    Silva, S. Ravi P.
    SCIENTIFIC REPORTS, 2022, 12 (01)
  • [36] Towards High-Quality Specular Highlight Removal by Leveraging Large-Scale Synthetic Data
    Fu, Gang
    Zhang, Qing
    Zhu, Lei
    Xiao, Chunxia
    Li, Ping
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 12811 - 12819
  • [37] Construction of High-Quality Rice Ribosome Footprint Library
    Yang, Xiaoyu
    Cui, Jie
    Song, Bo
    Yu, Yu
    Mo, Beixin
    Liu, Lin
    FRONTIERS IN PLANT SCIENCE, 2020, 11
  • [38] Towards a coherent and high-quality science policy on biodiversity
    Aline van der Werf
    Hydrobiologia, 2005, 542 : 35 - 37
  • [39] Towards High-Quality Photorealistic Image Style Transfer
    Ding, Hong
    Zhang, Haimin
    Fu, Gang
    Jiang, Caoqing
    Luo, Fei
    Xiao, Chunxia
    Xu, Min
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 9892 - 9905
  • [40] DriveGAN: Towards a Controllable High-Quality Neural Simulation
    Kim, Seung Wook
    Philion, Jonah
    Torralba, Antonio
    Fidler, Sanja
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 5816 - 5825