Learning Invariant Molecular Representation in Latent Discrete Space

Cited by: 0
Authors
Zhuang, Xiang [1,2,3]
Zhang, Qiang [1,2,3]
Ding, Keyan [2]
Bian, Yatao [4]
Wang, Xiao [5]
Lv, Jingsong [6]
Chen, Hongyang [6]
Chen, Huajun [1,2,3]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou, Peoples R China
[2] ZJU Hangzhou Global Sci & Technol Innovat Ctr, Hangzhou, Peoples R China
[3] Zhejiang Univ Ant Grp Joint Lab Knowledge Graph, Hangzhou, Peoples R China
[4] Tencent AI Lab, Shenzhen, Peoples R China
[5] Beihang Univ, Sch Software, Beijing, Peoples R China
[6] Zhejiang Lab, Hangzhou, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DESIGN;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Molecular representation learning lays the foundation for drug discovery. However, existing methods suffer from poor out-of-distribution (OOD) generalization, particularly when training and test data originate from different environments. To address this issue, we propose a new framework for learning molecular representations that are invariant and robust to distribution shifts. Specifically, we propose a "first-encoding-then-separation" strategy that identifies invariant molecular features in the latent space, departing from conventional practice. Prior to the separation step, we introduce a residual vector quantization module that mitigates over-fitting to the training data distribution while preserving the expressivity of the encoder. Furthermore, we design a task-agnostic self-supervised learning objective that encourages precise invariance identification, which makes our method applicable to a wide variety of tasks, such as regression and multi-label classification. Extensive experiments on 18 real-world molecular datasets demonstrate that our model generalizes better than state-of-the-art baselines in the presence of various distribution shifts. Our code is available at https://github.com/HICAI-ZJU/iMoLD.
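The residual vector quantization step described in the abstract can be sketched in a few lines. Below is a minimal, hypothetical PyTorch illustration of generic residual VQ with a straight-through gradient estimator; the class name, stage count, codebook size, and dimensionality are assumptions for illustration and are not taken from the iMoLD codebase.

    import torch
    import torch.nn as nn

    class ResidualVQ(nn.Module):
        # Each stage quantizes the residual left by the previous stage; the
        # per-stage codewords are summed into the final discrete representation.
        # All hyperparameters here are illustrative defaults, not iMoLD's.
        def __init__(self, num_stages=4, codebook_size=256, dim=64):
            super().__init__()
            self.codebooks = nn.ModuleList(
                nn.Embedding(codebook_size, dim) for _ in range(num_stages)
            )

        def forward(self, z):  # z: (batch, dim) continuous encoder output
            residual = z
            quantized = torch.zeros_like(z)
            for codebook in self.codebooks:
                # Nearest codeword to the current residual (Euclidean distance).
                dists = torch.cdist(residual, codebook.weight)  # (batch, codebook_size)
                codes = codebook(dists.argmin(dim=-1))          # (batch, dim)
                quantized = quantized + codes
                residual = residual - codes
            # Straight-through estimator: gradients bypass the discrete argmin.
            return z + (quantized - z).detach()

Usage would look like z_q = ResidualVQ()(encoder_output). In the paper's framework the quantized representation is subsequently separated into invariant and spurious components; that separation logic is beyond this sketch.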
Pages: 18