Multi-modal molecule structure–text model for text-based retrieval and editing

Cited by: 0
Authors
Shengchao Liu
Weili Nie
Chengpeng Wang
Jiarui Lu
Zhuoran Qiao
Ling Liu
Jian Tang
Chaowei Xiao
Animashree Anandkumar
Affiliations
[1] Mila-Québec Artificial Intelligence Institute
[2] Université de Montréal
[3] NVIDIA Research
[4] University of Illinois Urbana-Champaign
[5] California Institute of Technology
[6] Princeton University
[7] HEC Montréal
[8] Arizona State University
Source
Keywords
DOI
Not available
CLC number
Subject classification code
Abstract
Artificial intelligence is increasingly adopted in drug discovery. However, existing machine learning studies mainly use the chemical structures of molecules and ignore the vast textual knowledge available in chemistry. Incorporating textual knowledge makes it possible to realize new drug design objectives, adapt to text-based instructions and predict complex biological activities. Here we present a multi-modal molecule structure–text model, MoleculeSTM, which jointly learns molecules’ chemical structures and textual descriptions via a contrastive learning strategy. To train MoleculeSTM, we construct a large multi-modal dataset, PubChemSTM, with over 280,000 chemical structure–text pairs. To demonstrate the effectiveness and utility of MoleculeSTM, we design two challenging zero-shot tasks based on text instructions: structure–text retrieval and molecule editing. MoleculeSTM has two main properties: open vocabulary and compositionality via natural language. In experiments, MoleculeSTM achieves state-of-the-art generalization to novel biochemical concepts across various benchmarks.
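As a rough illustration of the contrastive idea mentioned in the abstract, the sketch below scores structure and text embeddings by cosine similarity, computes a symmetric InfoNCE-style loss over matched pairs, and performs zero-shot retrieval by picking the best-scoring candidate. The abstract does not specify the loss, the encoders, or the temperature, so the loss form, the toy 2-D embeddings, and every function name here are assumptions made for illustration and not the MoleculeSTM implementation.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def similarity_matrix(struct_emb, text_emb):
    """Cosine similarity between every structure and every text embedding."""
    s = [normalize(v) for v in struct_emb]
    t = [normalize(v) for v in text_emb]
    return [[sum(a * b for a, b in zip(si, tj)) for tj in t] for si in s]

def info_nce(sim, temperature=0.1):
    """Symmetric InfoNCE-style loss (assumed form); row i matches column i."""
    n = len(sim)

    def one_direction(rows):
        total = 0.0
        for i, row in enumerate(rows):
            logits = [x / temperature for x in row]
            m = max(logits)  # subtract the max for numerical stability
            log_z = m + math.log(sum(math.exp(x - m) for x in logits))
            total += log_z - logits[i]  # negative log-softmax of the true pair
        return total / n

    cols = [list(c) for c in zip(*sim)]  # text-to-structure direction
    return 0.5 * (one_direction(sim) + one_direction(cols))

def retrieve(sim_row):
    """Zero-shot retrieval: index of the highest-scoring candidate."""
    return max(range(len(sim_row)), key=lambda j: sim_row[j])

# Hypothetical toy embeddings standing in for encoder outputs.
struct_emb = [[1.0, 0.1], [0.1, 1.0]]
text_emb = [[0.9, 0.2], [0.2, 0.9]]

sim = similarity_matrix(struct_emb, text_emb)
sim_swapped = similarity_matrix(struct_emb, text_emb[::-1])

aligned_loss = info_nce(sim)
misaligned_loss = info_nce(sim_swapped)
best_text_for_first_structure = retrieve(sim[0])
```

In this toy setup the loss is lower when matched pairs point in similar directions than when the text embeddings are swapped, which is the behaviour a contrastive objective rewards during training.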
Pages: 1447-1457
Number of pages: 10