Knowledge Boosting: Rethinking Medical Contrastive Vision-Language Pre-training

被引:0
|
作者
Chen, Xiaofei [1 ]
He, Yuting [1 ]
Xue, Cheng [1 ]
Ge, Rongjun [2 ]
Li, Shuo [3 ]
Yang, Guanyu [1 ,4 ,5 ]
机构
[1] Southeast Univ, Minist Educ, Key Lab New Generat Artificial Intelligence Techn, Dhaka, Bangladesh
[2] Nanjing Univ Aeronaut & Astronaut, Nanjing, Peoples R China
[3] Case Western Reserve Univ, Dept Biomed Engn, Cleveland, OH 44106 USA
[4] Southeast Univ, Joint Int Res Lab Med Informat Proc, Nanjing 210096, Peoples R China
[5] Ctr Rech Informat Biomed Sino Francais CRIBs, Nanjing, Peoples R China
关键词
D O I
10.1007/978-3-031-43907-0_39
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
The foundation models based on pre-training technology have significantly advanced artificial intelligence from theoretical to practical applications. These models have facilitated the feasibility of computer-aided diagnosis for widespread use. Medical contrastive vision-language pre-training, which does not require human annotations, is an effective approach for guiding representation learning using description information in diagnostic reports. However, the effectiveness of pre-training is limited by the large-scale semantic overlap and shifting problems in medical field. To address these issues, we propose the Knowledge-Boosting Contrastive Vision-Language Pre-training framework (KoBo), which integrates clinical knowledge into the learning of vision-language semantic consistency. The framework uses an unbiased, open-set sample-wise knowledge representation to measure negative sample noise and supplement the correspondence between vision-language mutual information and clinical knowledge. Extensive experiments validate the effect of our framework on eight tasks including classification, segmentation, retrieval, and semantic relatedness, achieving comparable or better performance with the zero-shot or few-shot settings. Our code is open on https://github.com/ChenXiaoFei-CS/KoBo.
引用
收藏
页码:405 / 415
页数:11
相关论文
共 50 条
  • [1] Contrastive Vision-Language Pre-training with Limited Resources
    Cui, Quan
    Zhou, Boyan
    Guo, Yu
    Yin, Weidong
    Wu, Hao
    Yoshie, Osamu
    Chen, Yubo
    [J]. COMPUTER VISION, ECCV 2022, PT XXXVI, 2022, 13696 : 236 - 253
  • [2] Vision-Language Pre-Training with Triple Contrastive Learning
    Yang, Jinyu
    Duan, Jiali
    Tran, Son
    Xu, Yi
    Chanda, Sampath
    Chen, Liqun
    Zeng, Belinda
    Chilimbi, Trishul
    Huang, Junzhou
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 15650 - 15659
  • [3] Vision-Language Pre-Training for Boosting Scene Text Detectors
    Song, Sibo
    Wan, Jianqiang
    Yang, Zhibo
    Tang, Jun
    Cheng, Wenqing
    Bai, Xiang
    Yao, Cong
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 15660 - 15670
  • [4] COOKIE: Contrastive Cross-Modal Knowledge Sharing Pre-training for Vision-Language Representation
    Wen, Keyu
    Xia, Jin
    Huang, Yuanyuan
    Li, Linyang
    Xu, Jiayan
    Shao, Jie
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 2188 - 2197
  • [5] Survey on Vision-language Pre-training
    Yin, Jiong
    Zhang, Zhe-Dong
    Gao, Yu-Han
    Yang, Zhi-Wen
    Li, Liang
    Xiao, Mang
    Sun, Yao-Qi
    Yan, Cheng-Gang
    [J]. Ruan Jian Xue Bao/Journal of Software, 2023, 34 (05): : 2000 - 2023
  • [6] VLP: A Survey on Vision-language Pre-training
    Chen, Fei-Long
    Zhang, Du-Zhen
    Han, Ming-Lun
    Chen, Xiu-Yi
    Shi, Jing
    Xu, Shuang
    Xu, Bo
    [J]. MACHINE INTELLIGENCE RESEARCH, 2023, 20 (01) : 38 - 56
  • [7] VLP: A Survey on Vision-language Pre-training
    Fei-Long Chen
    Du-Zhen Zhang
    Ming-Lun Han
    Xiu-Yi Chen
    Jing Shi
    Shuang Xu
    Bo Xu
    [J]. Machine Intelligence Research, 2023, 20 (01) : 38 - 56
  • [8] VLP: A Survey on Vision-language Pre-training
    Fei-Long Chen
    Du-Zhen Zhang
    Ming-Lun Han
    Xiu-Yi Chen
    Jing Shi
    Shuang Xu
    Bo Xu
    [J]. Machine Intelligence Research, 2023, 20 : 38 - 56
  • [9] Bootstrapping Vision-Language Learning with Decoupled Language Pre-training
    Jian, Yiren
    Gao, Chongyang
    Vosoughi, Soroush
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [10] Pre-training A Prompt Pool for Vision-Language Model
    Liu, Jun
    Gu, Yang
    Yang, Zhaohua
    Guo, Shuai
    Liu, Huaqiu
    Chen, Yiqiang
    [J]. 2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,