Language models for protein design

Authors
Lee, Jin Sub [1]
Abdin, Osama [1]
Kim, Philip M. [1, 2, 3]
Affiliations
[1] Univ Toronto, Dept Mol Genet, Toronto, ON M5S 1A8, Canada
[2] Univ Toronto, Donnelly Ctr Cellular & Biomol Res, Toronto, ON M5S 3E1, Canada
[3] Univ Toronto, Dept Comp Sci, Toronto, ON M5S 2E4, Canada
DOI
10.1016/j.sbi.2025.103027
Chinese Library Classification (CLC)
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
The recent surge of large language models has shown that machines are capable of reading, understanding, and communicating through language, even sometimes displaying capabilities surpassing those of humans. Proteins can be represented as strings of amino acids akin to words in a sentence, and the same principles of language modeling can be used to learn informative representations for protein structure prediction, design, and property prediction. In this review, we will focus on applications of language modeling to protein design. We will first cover the foundations of protein language modeling and discuss recent advances such as context-conditioned design and structure integration. We also consider current shortcomings and promising avenues of research for protein language modeling to facilitate future development of improved protein language models for design.
Pages: 8