Unsupervised evolution of protein and antibody complexes with a structure-informed language model

Times cited: 11
Authors
Shanker, Varun R. [1, 2, 3]
Bruun, Theodora U. J. [2, 3, 4]
Hie, Brian L. [3, 4, 6, 7, 8]
Kim, Peter S. [3, 4, 5]
Affiliations
[1] Stanford Univ, Sch Med, Stanford Biophys Program, Stanford, CA 94305 USA
[2] Stanford Univ, Sch Med, Stanford Med Scientist Training Program, Stanford, CA 94305 USA
[3] Stanford Univ, Sarafan ChEM-H, Stanford, CA 94305 USA
[4] Stanford Univ, Sch Med, Dept Biochem, Stanford, CA 94305 USA
[5] Chan Zuckerberg Biohub, San Francisco, CA 94158 USA
[6] Stanford Univ, Dept Chem Engn, Stanford, CA 94305 USA
[7] Stanford Univ, Stanford Data Sci, Stanford, CA 94305 USA
[8] Arc Inst, Palo Alto, CA 94304 USA
Keywords
FITNESS LANDSCAPES; SEQUENCE; DESIGN; SELECTION; RECOGNITION; INHIBITION; GENERATION; REVEALS; SET
DOI
10.1126/science.adk8946
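Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification codes
07; 0710; 09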
Abstract
Large language models trained on sequence information alone can learn high-level principles of protein design. However, beyond sequence, the three-dimensional structures of proteins determine their specific function, activity, and evolvability. Here, we show that a general protein language model augmented with protein structure backbone coordinates can guide evolution for diverse proteins without the need to model individual functional tasks. We also demonstrate that ESM-IF1, which was only trained on single-chain structures, can be extended to engineer protein complexes. Using this approach, we screened about 30 variants of two therapeutic clinical antibodies used to treat severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection. We achieved up to 25-fold improvement in neutralization and 37-fold improvement in affinity against antibody-escaped viral variants of concern BQ.1.1 and XBB.1.5, respectively. These findings highlight the advantage of integrating structural information to identify efficient protein evolution trajectories without requiring any task-specific training data.
Pages: 46-53
Page count: 8
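
The abstract describes using a structure-conditioned language model to prioritize a small set of variants without task-specific training data. As a rough illustration of that general idea only, the sketch below ranks single amino-acid substitutions by their log-likelihood ratio against the wild-type residue. It is not the authors' pipeline and does not call ESM-IF1: the function `rank_substitutions` and the table `toy_probs` are hypothetical, and the probabilities are made-up stand-ins for whatever per-position distribution a real model would produce given a backbone structure.

```python
import math
from typing import Dict, List, Tuple

def rank_substitutions(
    wild_type: str,
    position_probs: List[Dict[str, float]],
    top_k: int,
) -> List[Tuple[str, float]]:
    """Rank single substitutions by log P(mutant) - log P(wild type).

    position_probs[i][aa] is an ASSUMED model output: the probability of
    residue `aa` at position i given the structure. It is hypothetical here.
    """
    scored = []
    for i, wt_aa in enumerate(wild_type):
        probs = position_probs[i]
        wt_p = probs.get(wt_aa, 0.0)
        if wt_p <= 0.0:
            continue  # cannot form a ratio without a wild-type probability
        for aa, p in probs.items():
            if aa == wt_aa or p <= 0.0:
                continue
            # Label uses 1-based numbering, e.g. "G2S" = Gly at position 2 to Ser
            label = f"{wt_aa}{i + 1}{aa}"
            scored.append((label, math.log(p) - math.log(wt_p)))
    # Highest log-likelihood ratio first
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]

# Toy two-residue example with invented probabilities (illustration only)
toy_wild_type = "AG"
toy_probs = [
    {"A": 0.5, "V": 0.4, "G": 0.1},
    {"G": 0.2, "S": 0.6, "A": 0.2},
]
ranked = rank_substitutions(toy_wild_type, toy_probs, top_k=2)
```

In a setting like the one the abstract describes, only the top few dozen candidates from such a ranking would be taken forward for experimental screening. How the published work scores multi-chain complexes is not specified in this record.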