MolFeSCue: enhancing molecular property prediction in data-limited and imbalanced contexts using few-shot and contrastive learning

Cited by: 3
Authors
Zhang, Ruochi [1, 2]
Wu, Chao [1, 3]
Yang, Qian [1, 3]
Liu, Chang [4]
Wang, Yan [1, 3]
Li, Kewei [1, 3]
Huang, Lan [1, 3]
Zhou, Fengfeng [1, 3, 5]
Affiliations
[1] Jilin Univ, Key Lab Symbol Computat & Knowledge Engn, Minist Educ, Changchun 130012, Jilin, Peoples R China
[2] Jilin Univ, Sch Artificial Intelligence, Changchun 130012, Peoples R China
[3] Jilin Univ, Coll Comp Sci & Technol, Changchun 130012, Jilin, Peoples R China
[4] Beijing Life Sci Acad, Beijing 102209, Peoples R China
[5] Guizhou Med Univ, Sch Biol & Engn, Guiyang 550025, Guizhou, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
CHALLENGES
DOI
10.1093/bioinformatics/btae118
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Predicting molecular properties is a pivotal task in various scientific domains, including drug discovery, material science, and computational chemistry. This problem is often hindered by the lack of annotated data and imbalanced class distributions, which pose significant challenges in developing accurate and robust predictive models.
Results: This study tackles these issues by employing pretrained molecular models within a few-shot learning framework. A novel dynamic contrastive loss function is utilized to further improve model performance under class imbalance. The proposed MolFeSCue framework not only facilitates rapid generalization from minimal samples, but also employs a contrastive loss function to extract meaningful molecular representations from imbalanced datasets. Extensive evaluations and comparisons of MolFeSCue and state-of-the-art algorithms have been conducted on multiple benchmark datasets, and the experimental data demonstrate our algorithm's effectiveness in molecular representations and its broad applicability across various pretrained models. Our findings underscore MolFeSCue's potential to accelerate advancements in drug discovery.
Availability and implementation: We have made all the source code utilized in this study publicly accessible at http://www.healthinformaticslab.org/supp/ or via GitHub at https://github.com/zhangruochi/MolFeSCue. The code (MolFeSCue-v1-00) is also available as the supplementary file of this paper.
Pages: 11
Related papers
50 items in total
  • [41] Task-Sequencing Meta Learning for Intelligent Few-Shot Fault Diagnosis With Limited Data
    Hu, Yidan
    Liu, Ruonan
    Li, Xianling
    Chen, Dongyue
    Hu, Qinghua
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2022, 18 (06) : 3894 - 3904
  • [42] Plant disease recognition in a low data scenario using few-shot learning
    Rezaei, Masoud
    Diepeveen, Dean
    Laga, Hamid
    Jones, Michael G. K.
    Sohel, Ferdous
    COMPUTERS AND ELECTRONICS IN AGRICULTURE, 2024, 219
  • [43] Pulse and Signal Data Classification Using Conventional and Few-Shot Machine Learning
    Lee, Kayla
    George, Kiran
    2022 IEEE WORLD AI IOT CONGRESS (AIIOT), 2022, : 311 - 317
  • [44] Enhancing Information Maximization With Distance-Aware Contrastive Learning for Source-Free Cross-Domain Few-Shot Learning
    Xu, Huali
    Liu, Li
    Zhi, Shuaifeng
    Fu, Shaojing
    Su, Zhuo
    Cheng, Ming-Ming
    Liu, Yongxiang
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 2058 - 2073
  • [45] Identification of monocotyledons and dicotyledons leaves diseases with limited multi-category data by few-shot learning
    Jinchao Pan
    Qiufeng Wu
    Yiping Chen
    Yixin Guo
    Zhongkai Zhao
    Journal of Plant Diseases and Protection, 2022, 129 : 651 - 663
  • [46] A One-Dimensional Siamese Few-Shot Learning Approach for ECG Classification under Limited Data
    Li, Zongjin
    Wang, Huan
    Liu, Xinwen
    2021 43RD ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE & BIOLOGY SOCIETY (EMBC), 2021, : 455 - 458
  • [47] Identification of monocotyledons and dicotyledons leaves diseases with limited multi-category data by few-shot learning
    Pan, Jinchao
    Wu, Qiufeng
    Chen, Yiping
    Guo, Yixin
    Zhao, Zhongkai
    JOURNAL OF PLANT DISEASES AND PROTECTION, 2022, 129 (03) : 651 - 663
  • [48] A new few-shot learning model for runoff prediction: Demonstration in two data scarce regions
    Yang, Minghong
    Yang, Qinli
    Shao, Junming
    Wang, Guoqing
    Zhang, Wei
    ENVIRONMENTAL MODELLING & SOFTWARE, 2023, 162
  • [49] Enhancing Low-Cost Molecular Property Prediction with Contrastive Learning on SMILES Representations
    Quiles, Marcos G.
    Ribeiro, Piero A. L.
    Pinheiro, Gabriel A.
    Prati, Ronaldo C.
    da Silva, Juarez L. F.
    COMPUTATIONAL SCIENCE AND ITS APPLICATIONS-ICCSA 2024 WORKSHOPS, PT IX, 2024, 14823 : 387 - 401
  • [50] Few-Shot Learning Method for Continuous Prediction of Rock Mechanical Parameters Based on Logging Data
    Zhao, Weiguang
    Sang, Shuxun
    Han, Sijie
    Cheng, Deqiang
    Zhou, Xiaozhi
    Zhang, Jinchao
    Zhao, Fuping
    ACS OMEGA, 2024, 9 (47) : 47234 - 47247