Learning Substructure Invariance for Out-of-Distribution Molecular Representations

Cited by: 0
Authors
Yang, Nianzu [1 ]
Zeng, Kaipeng [1 ]
Wu, Qitian [1 ]
Jia, Xiaosong [1 ]
Yan, Junchi [1 ,2 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, MoE Key Lab Artificial Intelligence, Shanghai, Peoples R China
[2] Shanghai AI Lab, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
HIV-1; INTEGRASE; IDENTIFICATION; DRUGS;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Molecule representation learning (MRL) has been extensively studied, and current methods have shown promising power for various tasks, e.g., molecular property prediction and target identification. However, a common assumption of existing methods is that both model development and experimental evaluation are mostly based on i.i.d. data across training and testing. This assumption can be violated in real-world applications, where testing molecules may come from new environments, leading to serious performance degradation or unexpected predictions. We propose a new representation learning framework, entitled MoleOOD, to enhance the robustness of MRL models against such distribution shifts, motivated by the observation that the (bio)chemical properties of molecules are usually invariantly associated with certain privileged molecular substructures across different environments (e.g., scaffolds, sizes, etc.). Specifically, we introduce an environment inference model to identify, in a fully data-driven manner, the latent factors that impact data generation across different distributions. We also propose a new learning objective that guides the molecule encoder to leverage environment-invariant substructures that relate more stably to the labels across environments. Extensive experiments on ten real-world datasets demonstrate that our model has stronger generalization ability than existing methods under various out-of-distribution (OOD) settings, despite the absence of manual environment specifications. In particular, our method achieves up to 5.9% and 3.9% ROC-AUC improvement over the strongest baselines on the OGB and DrugOOD benchmarks, respectively. Our source code is publicly available at https://github.com/yangnianzu0515/MoleOOD.
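The invariance idea in the abstract — train an encoder so that the substructure-to-label relation stays stable across inferred environments — can be illustrated with a generic surrogate objective: average per-environment risk plus a penalty on the variance of risks across environments (in the spirit of REx-style invariant learning). This is a hypothetical toy sketch, not the paper's actual MoleOOD loss; `losses` and `env_ids` are assumed inputs (per-molecule losses and environment labels produced by an inference model).

```python
import numpy as np

def environment_risks(losses, env_ids):
    """Mean loss within each inferred environment."""
    envs = np.unique(env_ids)
    return np.array([losses[env_ids == e].mean() for e in envs])

def invariance_objective(losses, env_ids, beta=1.0):
    """Average environment risk plus a cross-environment variance penalty.

    Penalizing the variance of per-environment risks discourages the
    encoder from relying on features whose predictive power changes
    between environments, steering it toward invariant substructures.
    """
    risks = environment_risks(losses, env_ids)
    return risks.mean() + beta * risks.var()

# Toy example: four molecules, two inferred environments.
losses = np.array([0.2, 0.4, 0.3, 0.9])
env_ids = np.array([0, 0, 1, 1])
obj = invariance_objective(losses, env_ids, beta=1.0)  # 0.45 + 0.0225
```

In practice such a penalty would be computed on minibatch losses and backpropagated through the encoder; the environment labels here stand in for the output of the data-driven environment inference model described in the abstract.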
Pages: 15
Related papers (50 total)
  • [21] Causal Representation Learning for Out-of-Distribution Recommendation
    Wang, Wenjie
    Lin, Xinyu
    Feng, Fuli
    He, Xiangnan
    Lin, Min
    Chua, Tat-Seng
    PROCEEDINGS OF THE ACM WEB CONFERENCE 2022 (WWW'22), 2022, : 3562 - 3571
  • [22] Out-Of-Distribution Detection In Unsupervised Continual Learning
    He, Jiangpeng
    Zhu, Fengqing
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022, : 3849 - 3854
  • [23] Deep Stable Learning for Out-Of-Distribution Generalization
    Zhang, Xingxuan
    Cui, Peng
    Xu, Renzhe
    Zhou, Linjun
    He, Yue
    Shen, Zheyan
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 5368 - 5378
  • [24] Out-of-distribution generalization for learning quantum dynamics
    Caro, Matthias C.
    Huang, Hsin-Yuan
    Ezzell, Nicholas
    Gibbs, Joe
    Sornborger, Andrew T.
    Cincio, Lukasz
    Coles, Patrick J.
    Holmes, Zoë
    NATURE COMMUNICATIONS, 2023, 14
  • [25] Learning Modular Structures That Generalize Out-of-Distribution
    Ashok, Arjun
    Devaguptapu, Chaitanya
    Balasubramanian, Vineeth N.
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 12905 - 12906
  • [26] CausPref: Causal Preference Learning for Out-of-Distribution Recommendation
    He, Yue
    Wang, Zimu
    Cui, Peng
    Zou, Hao
    Zhang, Yafeng
    Cui, Qiang
    Jiang, Yong
    PROCEEDINGS OF THE ACM WEB CONFERENCE 2022 (WWW'22), 2022, : 410 - 421
  • [27] Verifying the Generalization of Deep Learning to Out-of-Distribution Domains
    Amir, Guy
    Maayan, Osher
    Zelazny, Tom
    Katz, Guy
    Schapira, Michael
    JOURNAL OF AUTOMATED REASONING, 2024, 68 (03)
  • [28] Model Agnostic Sample Reweighting for Out-of-Distribution Learning
    Zhou, Xiao
    Lin, Yong
    Pi, Renjie
    Zhang, Weizhong
    Xu, Renzhe
    Cui, Peng
    Zhang, Tong
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
  • [29] Probing out-of-distribution generalization in machine learning for materials
    Li, Kangming
    Rubungo, Andre Niyongabo
    Lei, Xiangyun
    Persaud, Daniel
    Choudhary, Kamal
    Decost, Brian
    Dieng, Adji Bousso
    Hattrick-Simpers, Jason
    COMMUNICATIONS MATERIALS, 2025, 6 (01)
  • [30] Understanding and Improving Feature Learning for Out-of-Distribution Generalization
    Chen, Yongqiang
    Huang, Wei
    Zhou, Kaiwen
    Bian, Yatao
    Han, Bo
    Cheng, James
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,