Language models for protein design

Authors
Lee, Jin Sub [1]
Abdin, Osama [1]
Kim, Philip M. [1, 2, 3]
Affiliations
[1] Univ Toronto, Dept Mol Genet, Toronto, ON M5S 1A8, Canada
[2] Univ Toronto, Donnelly Ctr Cellular & Biomol Res, Toronto, ON M5S 3E1, Canada
[3] Univ Toronto, Dept Comp Sci, Toronto, ON M5S 2E4, Canada
DOI
10.1016/j.sbi.2025.103027
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
The recent surge of large language models has shown that machines are capable of reading, understanding, and communicating through language, sometimes even displaying capabilities surpassing those of humans. Proteins can be represented as strings of amino acids akin to words in a sentence, and the same principles of language modeling can be used to learn informative representations for protein structure prediction, design, and property prediction. In this review, we will focus on applications of language modeling to protein design. We will first cover the foundations of protein language modeling and discuss recent advances such as context-conditioned design and structure integration. We also consider current shortcomings and promising avenues of research for protein language modeling to facilitate future development of improved protein language models for design.
Pages: 8
Related papers
  • [1] Controllable protein design with language models
    Noelia Ferruz
    Birte Höcker
    Nature Machine Intelligence, 2022, 4 : 521 - 532
  • [2] Controllable protein design with language models
    Ferruz, Noelia
    Hoecker, Birte
    NATURE MACHINE INTELLIGENCE, 2022, 4 (06) : 521 - 532
  • [3] The promises of large language models for protein design and modeling
    Valentini, Giorgio
    Malchiodi, Dario
    Gliozzo, Jessica
    Mesiti, Marco
    Soto-Gomez, Mauricio
    Cabri, Alberto
    Reese, Justin
    Casiraghi, Elena
    Robinson, Peter N.
    FRONTIERS IN BIOINFORMATICS, 2023, 3
  • [4] Current progress, challenges, and future perspectives of language models for protein representation and protein design
    Huang, Tao
    Li, Yixue
    INNOVATION, 2023, 4 (04)
  • [5] Benchmarking protein language models for protein crystallization
    Mall, Raghvendra
    Kaushik, Rahul
    Martinez, Zachary A.
    Thomson, Matt W.
    Castiglione, Filippo
    SCIENTIFIC REPORTS, 2025, 15 (01)
  • [6] A Visual Language for Protein Design
    Cox, Robert Sidney, III
    McLaughlin, James Alastair
    Gruenberg, Raik
    Beal, Jacob
    Wipat, Anil
    Sauro, Herbert M.
    ACS SYNTHETIC BIOLOGY, 2017, 6 (07) : 1120 - 1123
  • [7] Protein language models using convolutions
    Tang, Lin
    NATURE METHODS, 2024, 21 (04) : 550 - 550
  • [8] Chemical language models for molecular design
    Bajorath, Juergen
    MOLECULAR INFORMATICS, 2024, 43 (01)
  • [9] A tool for automated design of language models
    Yang, YP
    Deller, JR
    ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996 : 518 - 521
  • [10] GPCR-BERT: Interpreting Sequential Design of G Protein-Coupled Receptors Using Protein Language Models
    Kim, Seongwon
    Mollaei, Parisa
    Antony, Akshay
    Magar, Rishikesh
    Barati Farimani, Amir
    JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2024, 64 (04) : 1134 - 1144