Multi-modal molecule structure–text model for text-based retrieval and editing

Authors
Shengchao Liu
Weili Nie
Chengpeng Wang
Jiarui Lu
Zhuoran Qiao
Ling Liu
Jian Tang
Chaowei Xiao
Animashree Anandkumar
Affiliations
[1] Mila-Québec Artificial Intelligence Institute
[2] Université de Montréal
[3] NVIDIA Research
[4] University of Illinois Urbana-Champaign
[5] California Institute of Technology
[6] Princeton University
[7] HEC Montréal
[8] Arizona State University
Abstract
There is increasing adoption of artificial intelligence in drug discovery. However, existing studies use machine learning to mainly utilize the chemical structures of molecules but ignore the vast textual knowledge available in chemistry. Incorporating textual knowledge enables us to realize new drug design objectives, adapt to text-based instructions and predict complex biological activities. Here we present a multi-modal molecule structure–text model, MoleculeSTM, by jointly learning molecules’ chemical structures and textual descriptions via a contrastive learning strategy. To train MoleculeSTM, we construct a large multi-modal dataset, namely, PubChemSTM, with over 280,000 chemical structure–text pairs. To demonstrate the effectiveness and utility of MoleculeSTM, we design two challenging zero-shot tasks based on text instructions, including structure–text retrieval and molecule editing. MoleculeSTM has two main properties: open vocabulary and compositionality via natural language. In experiments, MoleculeSTM obtains the state-of-the-art generalization ability to novel biochemical concepts across various benchmarks.
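The abstract names a contrastive learning strategy over paired chemical structures and text, plus zero-shot structure–text retrieval, but this record does not give the exact objective. The following is a minimal sketch of one common form, a symmetric InfoNCE-style loss, assuming both encoders already produce fixed-length embeddings. The function names, the `temperature` value, and the toy vectors are illustrative assumptions, not details taken from MoleculeSTM.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def similarity_matrix(struct_embs, text_embs, temperature):
    """Cosine similarity of every structure/text pair, divided by temperature."""
    s = [normalize(v) for v in struct_embs]
    t = [normalize(v) for v in text_embs]
    return [[sum(a * b for a, b in zip(si, tj)) / temperature for tj in t]
            for si in s]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where row i's correct class is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(struct_embs, text_embs, temperature=0.1):
    """Average of structure-to-text and text-to-structure losses.

    Row i of each input is assumed to describe the same molecule;
    all other rows in the batch act as negatives.
    """
    logits = similarity_matrix(struct_embs, text_embs, temperature)
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diagonal(logits)
                  + cross_entropy_diagonal(transposed))


def retrieve(query_emb, candidate_embs):
    """Zero-shot retrieval: index of the candidate most similar to the query."""
    q = normalize(query_emb)
    scores = [sum(a * b for a, b in zip(q, normalize(c)))
              for c in candidate_embs]
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    # Toy embeddings standing in for encoder outputs (hypothetical values).
    structures = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    texts = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.0, 0.1, 0.9]]
    print("loss:", symmetric_contrastive_loss(structures, texts))
    print("best structure for text 1:", retrieve(texts[1], structures))
```

Once the two embedding spaces are aligned in this way, retrieval reduces to a nearest-neighbour lookup across modalities, which is why a text query never seen during training can still be scored against candidate structures.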
Pages: 1447-1457
Page count: 10