SMICLR: Contrastive Learning on Multiple Molecular Representations for Semisupervised and Unsupervised Representation Learning

Cited by: 17
Authors
Pinheiro, Gabriel A. [1]
Silva, Juarez L. F. [2]
Quiles, Marcos G. [1]
Affiliations
[1] Fed Univ Sao Paulo Unifesp, Inst Sci & Technol, BR-12247014 Sao Jose Dos Campos, SP, Brazil
[2] Univ Sao Paulo, Sao Carlos Inst Chem, BR-13560970 Sao Carlos, SP, Brazil
Funding
São Paulo Research Foundation (FAPESP), Brazil;
Keywords
PREDICTION; NETWORKS; LANGUAGE; MODELS; SMILES;
DOI
10.1021/acs.jcim.2c00521
Chinese Library Classification (CLC) number
R914 [Medicinal Chemistry];
Discipline classification code
100701;
Abstract
Machine learning as a tool for chemical space exploration broadens horizons to work with known and unknown molecules. At its core lies molecular representation, an essential key to improving learning about structure-property relationships. Recently, contrastive frameworks have shown impressive results for representation learning in diverse domains. Therefore, this paper proposes a contrastive framework that embraces multimodal molecular data. Specifically, our approach jointly trains a graph encoder and an encoder for the simplified molecular-input line-entry system (SMILES) string to perform the contrastive learning objective. Since SMILES is the basis of our method, i.e., we built the molecular graph from the SMILES, we call our framework SMILES Contrastive Learning (SMICLR). When stacking a nonlinear regressor on SMICLR's pretrained encoder and fine-tuning the entire model, we reduced the prediction error by, on average, 44% and 25% for the energetic and electronic properties of the QM9 data set, respectively, over the supervised baseline. We further improved our framework's performance when applying data augmentations in each molecular-input representation. Moreover, SMICLR demonstrated competitive representation learning results in an unsupervised setting.
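The abstract does not state which contrastive loss is used, so the following is only an illustrative sketch of a cross-modal contrastive objective. It assumes an NT-Xent (SimCLR-style) loss, and it assumes the graph encoder and the SMILES encoder have already produced one embedding vector per molecule. The function names `cosine` and `cross_modal_nt_xent` and the `temperature` parameter are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cross_modal_nt_xent(graph_emb, smiles_emb, temperature=0.5):
    """Mean NT-Xent loss over a batch of paired embeddings.

    graph_emb[i] and smiles_emb[i] are assumed to come from the same
    molecule (the positive pair); every other view in the batch is
    treated as a negative. This is an assumed loss, not the paper's.
    """
    n = len(graph_emb)
    views = graph_emb + smiles_emb  # 2n views; i and i + n are positives
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)
        # Similarities to every other view (the anchor itself is excluded).
        logits = [
            cosine(views[i], views[k]) / temperature
            for k in range(2 * n)
            if k != i
        ]
        pos_logit = cosine(views[i], views[pos]) / temperature
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denom - pos_logit
    return total / (2 * n)


# Toy batch: two molecules, 2-dimensional embeddings per modality.
graph_vectors = [[1.0, 0.0], [0.0, 1.0]]
smiles_vectors = [[1.0, 0.1], [0.1, 1.0]]
loss = cross_modal_nt_xent(graph_vectors, smiles_vectors)
```

In a training loop under these assumptions, the loss would be minimized with respect to both encoders at once, which pulls the graph and SMILES embeddings of the same molecule together while pushing apart embeddings of different molecules.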
Pages: 3948 - 3960
Page count: 13
Related papers
50 in total
  • [21] On Learning Contrastive Representations for Learning with Noisy Labels
    Yi, Li
    Liu, Sheng
    She, Qi
    McLeod, A. Ian
    Wang, Boyu
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 16661 - 16670
  • [22] Multiple Kernel Sparse Representations for Supervised and Unsupervised Learning
    Thiagarajan, Jayaraman J.
    Ramamurthy, Karthikeyan Natesan
    Spanias, Andreas
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2014, 23 (07) : 2905 - 2915
  • [23] Semisupervised Graph Contrastive Learning for Process Fault Diagnosis
    Jia, Mingwei
    Yang, Chao
    Liu, Qiang
    Gao, Zengliang
    Liu, Yi
    INDUSTRIAL & ENGINEERING CHEMISTRY RESEARCH, 2024, 63 (33) : 14712 - 14726
  • [24] Molecular contrastive learning of representations via graph neural networks
    Wang, Yuyang
    Wang, Jianren
    Cao, Zhonglin
    Farimani, Amir Barati
    Nature Machine Intelligence, 2022, 4 : 279 - 287
  • [25] Molecular contrastive learning of representations via graph neural networks
    Wang, Yuyang
    Wang, Jianren
    Cao, Zhonglin
    Farimani, Amir Barati
    NATURE MACHINE INTELLIGENCE, 2022, 4 (03) : 279 - 287
  • [26] Unsupervised Representation for Semantic Segmentation by Implicit Cycle-Attention Contrastive Learning
    Pang, Bo
    Li, Yizhuo
    Zhang, Yifan
    Peng, Gao
    Tang, Jiajun
    Zha, Kaiwen
    Li, Jiefeng
    Lu, Cewu
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 2044 - 2052
  • [27] Contrastive Positive Mining for Unsupervised 3D Action Representation Learning
    Zhang, Haoyuan
    Hou, Yonghong
    Zhang, Wenjing
    Li, Wanqing
    COMPUTER VISION - ECCV 2022, PT IV, 2022, 13664 : 36 - 51
  • [28] Mutual Contrastive Learning for Visual Representation Learning
    Yang, Chuanguang
    An, Zhulin
    Cai, Linhang
    Xu, Yongjun
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 3045 - 3053
  • [29] Masked Contrastive Representation Learning for Reinforcement Learning
    Zhu, Jinhua
    Xia, Yingce
    Wu, Lijun
    Deng, Jiajun
    Zhou, Wengang
    Qin, Tao
    Liu, Tie-Yan
    Li, Houqiang
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (03) : 3421 - 3433
  • [30] Semisupervised learning for molecular profiling
    Furlanello, C
    Serafini, M
    Merler, S
    Jurman, G
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2005, 2 (02) : 110 - 118