Integration of pre-trained protein language models into geometric deep learning networks

Cited by: 10
|
Authors
Wu, Fang [1 ]
Wu, Lirong [1 ]
Radev, Dragomir [2 ]
Xu, Jinbo [3 ,4 ]
Li, Stan Z. [1 ]
Affiliations
[1] Westlake Univ, AI Res & Innovat Lab, Hangzhou 310030, Peoples R China
[2] Yale Univ, Dept Comp Sci, New Haven, CT 06511 USA
[3] Tsinghua Univ, Inst AI Ind Res, Haidian St, Beijing 100084, Peoples R China
[4] Toyota Technol Inst Chicago, Chicago, IL 60637 USA
Keywords
PREDICTION; COLLECTION; BENCHMARK;
DOI
10.1038/s42003-023-05133-1
Chinese Library Classification
Q [Biological Sciences];
Discipline classification codes
07; 0710; 09;
Abstract
Geometric deep learning has recently achieved great success in non-Euclidean domains, and learning on the 3D structures of large biomolecules is emerging as a distinct research area. However, its efficacy is largely constrained by the limited quantity of structural data. Meanwhile, protein language models trained on substantial quantities of 1D sequences have shown burgeoning capabilities with scale across a broad range of applications. Several preceding studies have considered combining these protein modalities to boost the representational power of geometric neural networks, but they fail to present a comprehensive understanding of the benefits. In this work, we integrate the knowledge learned by well-trained protein language models into several state-of-the-art geometric networks and evaluate them on a variety of protein representation learning benchmarks, including protein-protein interface prediction, model quality assessment, protein-protein rigid-body docking, and binding affinity prediction. Our findings show an overall improvement of 20% over baselines. Strong evidence indicates that incorporating protein language models' knowledge enhances geometric networks' capacity by a significant margin and generalizes to complex tasks.
Pages: 8
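
As a rough illustration of the integration strategy the abstract describes, here is a minimal sketch of feeding per-residue protein language model embeddings into a geometric network's node features. It assumes ESM-2 via the fair-esm package and uses simple concatenation as the fusion step; the paper's exact fusion scheme may differ.

```python
# Minimal sketch: augmenting geometric node features with per-residue
# embeddings from a pre-trained protein language model (ESM-2 via the
# fair-esm package). Concatenation is one simple integration strategy;
# the paper's exact scheme may differ.
import torch
import esm

# Load a pre-trained ESM-2 model (650M parameters, 33 layers).
# Note: the checkpoint is downloaded on first use and is large.
model, alphabet = esm.pretrained.esm2_t33_650M_UR50D()
model.eval()
batch_converter = alphabet.get_batch_converter()

sequence = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"  # example sequence
_, _, tokens = batch_converter([("protein_1", sequence)])

with torch.no_grad():
    out = model(tokens, repr_layers=[33])
# Drop the BOS/EOS tokens to get one embedding per residue: (L, 1280).
plm_features = out["representations"][33][0, 1 : len(sequence) + 1]

# Hypothetical geometric node features, e.g. derived from backbone
# coordinates: (L, d_geom). In practice these come from the geometric
# network's structural input, not random values.
geom_features = torch.randn(len(sequence), 16)

# Concatenate per-residue PLM embeddings with geometric features so the
# geometric network sees both modalities: (L, d_geom + 1280).
node_features = torch.cat([geom_features, plm_features], dim=-1)
print(node_features.shape)  # torch.Size([33, 1296])
```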