Reciprocal best structure hits: using AlphaFold models to discover distant homologues

Cited by: 16
Authors
Monzon, Vivian [1]
Paysan-Lafosse, Typhaine [1]
Wood, Valerie [2]
Bateman, Alex [1]
Affiliations
[1] European Bioinformat Inst EMBL EBI, European Mol Biol Lab, Wellcome Genome Campus, Hinxton CB21 4HH, England
[2] Univ Cambridge, Dept Biochem, Cambridge CB2 1GA, England
Source
BIOINFORMATICS ADVANCES | 2022 / Volume 2 / Issue 01
Funding
Wellcome Trust (UK); Biotechnology and Biological Sciences Research Council (UK);
Keywords
PROTEIN; SEQUENCE; GENE;
DOI
10.1093/bioadv/vbac072
Chinese Library Classification number
Q [Biological Sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
Motivation: Conventional methods for detecting homologous protein pairs compare protein sequences. However, the sequences of two homologous proteins may diverge so far that standard approaches can no longer detect the relationship. The release of the AlphaFold 2.0 software enables the prediction of highly accurate protein structures and opens many opportunities to advance our understanding of protein function, including the detection of homologous protein pairs by structure.

Results: In this proof-of-concept work, we search for the closest homologous protein pairs using the structure models of five model organisms from the AlphaFold database. We compare the results with homologous protein pairs detected by sequence similarity and show that the structural matching approach finds a similar set of results. In addition, we detect potential novel homologs solely with the structural matching approach, which can help to understand the function of uncharacterized proteins and to make previously overlooked connections between well-characterized proteins. We also observe limitations of our implementation of the structure-based approach, particularly when handling highly disordered proteins or short protein structures. Our work shows that high-accuracy protein structure models can be used to discover homologous protein pairs, and we identify areas where this structural matching approach can be improved.

Availability and Implementation: Information on the discovered homologous protein pairs can be found at https://doi.org/10.17863/CAM.87873. The code can be accessed at https://github.com/VivianMonzon/Reciprocal_Best_Structure_Hits.

Supplementary information is available at Bioinformatics Advances online.
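The reciprocal-best-hit idea named in the title can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which structure search tool or score is used, so the function names, the `(query, target, score)` tuple layout, and the identifiers and scores below are all hypothetical. The sketch assumes that a higher score means a closer match and that each proteome has already been searched against the other.

```python
from typing import Dict, Iterable, Set, Tuple

# Assumed input layout: (query_id, target_id, similarity_score), higher = closer.
Hit = Tuple[str, str, float]


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to its highest-scoring target (first seen wins on ties)."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, target, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _) in best.items()}


def reciprocal_best_hits(
    a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit]
) -> Set[Tuple[str, str]]:
    """Keep pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return {(a, b) for a, b in forward.items() if backward.get(b) == a}


# Hypothetical protein identifiers and scores, for illustration only.
a_vs_b = [("A1", "B1", 0.92), ("A1", "B2", 0.40),
          ("A2", "B1", 0.55), ("A2", "B3", 0.50)]
b_vs_a = [("B1", "A1", 0.90), ("B2", "A1", 0.41), ("B3", "A2", 0.52)]

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

In this toy input, A2's best hit is B1, but B1's best hit is A1, so only the pair (A1, B1) survives the reciprocity check. Requiring agreement in both directions is what makes the criterion stricter than a one-way best hit.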
Pages: 9