Language models for protein design

Authors
Lee, Jin Sub [1]
Abdin, Osama [1]
Kim, Philip M. [1, 2, 3]
Affiliations
[1] Univ Toronto, Dept Mol Genet, Toronto, ON M5S 1A8, Canada
[2] Univ Toronto, Donnelly Ctr Cellular & Biomol Res, Toronto, ON M5S 3E1, Canada
[3] Univ Toronto, Dept Comp Sci, Toronto, ON M5S 2E4, Canada
DOI
10.1016/j.sbi.2025.103027
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
The recent surge of large language models has shown that machines are capable of reading, understanding, and communicating through language, sometimes even displaying capabilities that surpass those of humans. Proteins can be represented as strings of amino acids, akin to words in a sentence, and the same principles of language modeling can be used to learn informative representations for protein structure prediction, design, and property prediction. In this review, we focus on applications of language modeling to protein design. We first cover the foundations of protein language modeling and discuss recent advances such as context-conditioned design and structure integration. We also consider current shortcomings and promising avenues of research for protein language modeling, to facilitate the future development of improved protein language models for design.
Pages: 8