MRM: Masked Relation Modeling for Medical Image Pre-Training with Genetics

Cited by: 0
Authors
Yang, Qiushi [1 ]
Li, Wuyang [1 ]
Li, Baopu
Yuan, Yixuan [1 ,2 ]
Affiliations
[1] City Univ Hong Kong, Hong Kong, Peoples R China
[2] Chinese Univ Hong Kong, Hong Kong, Peoples R China
DOI
10.1109/ICCV51070.2023.01961
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Modern deep learning techniques for automatic multi-modal medical diagnosis rely on massive expert annotations, which are time-consuming and prohibitively expensive to obtain. Recent masked image modeling (MIM)-based pre-training methods have achieved impressive advances in learning meaningful representations from unlabeled data and transferring them to downstream tasks. However, these methods focus on natural images and ignore the specific properties of medical data, yielding unsatisfactory generalization performance on downstream medical diagnosis. In this paper, we aim to leverage genetics to boost image pre-training and present a masked relation modeling (MRM) framework. Instead of explicitly masking input data, as in previous MIM methods, which leads to a loss of disease-related semantics, we design relation masking to mask out token-wise feature relations at both self- and cross-modality levels; this preserves intact semantics within the input and allows the model to learn rich disease-related information. Moreover, to enhance semantic relation modeling, we propose relation matching to align the sample-wise relations between the intact and masked features. Relation matching exploits inter-sample relations by imposing global constraints in the feature space, providing sufficient semantic relations for feature representation. Extensive experiments demonstrate that the proposed framework is simple yet powerful, achieving state-of-the-art transfer performance on various downstream diagnosis tasks. Code is available at https://github.com/CityU-AIM-Group/MRM.
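To make the two mechanisms described in the abstract concrete, below is a minimal, illustrative PyTorch sketch of relation masking and relation matching, written from the abstract's description alone. The function names, tensor shapes, masking scheme, and loss form are assumptions for illustration, not the authors' implementation; refer to the linked repository for the actual method.

import torch
import torch.nn.functional as F

def relation_masking(img_tokens, gen_tokens, mask_ratio=0.5):
    """Illustrative relation masking (hypothetical): mask entries of the
    token-wise relation matrix rather than the input tokens, so the input
    semantics remain intact. The self-modality case is the same operation
    with a modality's tokens related to themselves.

    img_tokens: (B, N_i, D) image token features
    gen_tokens: (B, N_g, D) genetics token features
    """
    d = img_tokens.shape[-1]
    # Token-wise cross-modal relation matrix (scaled dot product).
    relation = torch.einsum('bnd,bmd->bnm', img_tokens, gen_tokens) / d ** 0.5
    # Randomly mask a fraction of relation entries (not input tokens).
    mask = torch.rand_like(relation) < mask_ratio
    masked_relation = relation.masked_fill(mask, -1e9)
    # Aggregate genetics tokens for each image token under the masked
    # relation; the model must recover disease-related structure despite
    # the missing relations.
    attn = masked_relation.softmax(dim=-1)
    return torch.einsum('bnm,bmd->bnd', attn, gen_tokens)

def relation_matching_loss(intact_feats, masked_feats):
    """Illustrative sample-wise relation matching (hypothetical): align the
    inter-sample similarity structure of the masked branch with that of the
    intact branch, acting as a global constraint in feature space."""
    # Pool tokens to one embedding per sample and L2-normalise.
    zi = F.normalize(intact_feats.mean(dim=1), dim=-1)
    zm = F.normalize(masked_feats.mean(dim=1), dim=-1)
    # Batch-level (inter-sample) relation matrices.
    ri = zi @ zi.t()
    rm = zm @ zm.t()
    return F.mse_loss(rm, ri.detach())

# Usage with random features: batch of 4, 16 image tokens, 8 gene tokens, dim 32.
img = torch.randn(4, 16, 32)
gen = torch.randn(4, 8, 32)
fused = relation_masking(img, gen)
loss = relation_matching_loss(img, fused)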
Pages: 21395-21405
Page count: 11