Large-scale structure-informed multiple sequence alignment of proteins with SIMSApiper

Authors
Crauwels, Charlotte [1, 2, 3]
Heidig, Sophie-Luise [1, 4]
Diaz, Adrian [1, 2, 3]
Vranken, Wim F. [1, 2, 3]
Affiliations
[1] ULB VUB, Interuniv Inst Bioinformat Brussels, B-1050 Brussels, Belgium
[2] Vrije Univ Brussel, Struct Biol Brussels, B-1050 Brussels, Belgium
[3] Vrije Univ Brussel, AI Lab, B-1050 Brussels, Belgium
[4] Univ Libre Bruxelles, Evolutionary Biol & Ecol, B-1050 Brussels, Belgium
Keywords
CRYSTAL-STRUCTURE; DATABASE; TIM;
DOI
10.1093/bioinformatics/btae276
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification code
071010; 081704
Abstract
SIMSApiper is a Nextflow pipeline that creates reliable, structure-informed MSAs of thousands of protein sequences faster than standard structure-based alignment methods. Structural information can be provided by the user or collected by the pipeline from online resources. Parallelization with sequence identity-based subsets can be activated to significantly speed up the alignment process. Finally, the number of gaps in the final alignment can be reduced by leveraging the position of conserved secondary structure elements.
Pages: 5