Disease-Informed Adaptation of Vision-Language Models

被引:0
|
作者
Zhang, Jiajin [1 ,2 ]
Wang, Ge [1 ,2 ]
Kalra, Mannudeep K. [3 ]
Yan, Pingkun [1 ,2 ]
机构
[1] Rensselaer Polytech Inst, Dept Biomed Engn, Troy, NY 12181 USA
[2] Rensselaer Polytech Inst, Ctr Biotechnol & Interdisciplinary Studies, Troy, NY 12181 USA
[3] Harvard Med Sch, Massachusetts Gen Hosp, Dept Radiol, Boston, MA USA
基金
美国国家科学基金会;
关键词
VLM; Transfer Learning; Underrepresented/New Diseases;
D O I
10.1007/978-3-031-72120-5_22
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
In medical image analysis, the expertise scarcity and the high cost of data annotation limits the development of large artificial intelligence models. This paper investigates the potential of transfer learning with pre-trained vision-language models (VLMs) in this domain. Currently, VLMs still struggle to transfer to the underrepresented diseases with minimal presence and new diseases entirely absent from the pre-training dataset. We argue that effective adaptation of VLMs hinges on the nuanced representation learning of disease concepts. By capitalizing on the joint visual-linguistic capabilities of VLMs, we introduce disease-informed contextual prompting in a novel disease prototype learning framework. This approach enables VLMs to grasp the concepts of new disease effectively and efficiently, even with limited data. Extensive experiments across multiple image modalities showcase notable enhancements in performance compared to existing techniques. The code will be made publicly available on our github page: https://github.com/RPIDIAL/Disease-informed-VLM-Adaptation.
引用
收藏
页码:232 / 242
页数:11
相关论文
共 50 条
  • [1] Few-Shot Adaptation of Medical Vision-Language Models
    Shakeri, Fereshteh
    Huang, Yunshi
    Silva-Rodriguez, Julio
    Bahig, Houda
    Tang, An
    Dolz, Jose
    Ben Ayed, Ismail
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2024, PT XII, 2024, 15012 : 553 - 563
  • [2] SwapPrompt: Test-Time Prompt Adaptation for Vision-Language Models
    Ma, Xiaosong
    Zhang, Jie
    Guo, Song
    Xu, Wenchao
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [3] LiFT: Transfer Learning in Vision-Language Models for Downstream Adaptation and Generalization
    Li, Jingzheng
    Sun, Hailong
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 4678 - 4687
  • [4] Black Box Few-Shot Adaptation for Vision-Language models
    Ouali, Yassine
    Bulat, Adrian
    Matinez, Brais
    Tzimiropoulos, Georgios
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 15488 - 15500
  • [5] Vision-Language Models for Vision Tasks: A Survey
    Zhang, Jingyi
    Huang, Jiaxing
    Jin, Sheng
    Lu, Shijian
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (08) : 5625 - 5644
  • [6] Vision-Language Navigation Policy Learning and Adaptation
    Wang, Xin
    Huang, Qiuyuan
    Celikyilmaz, Asli
    Gao, Jianfeng
    Shen, Dinghan
    Wang, Yuan-Fang
    Wang, William Yang
    Zhang, Lei
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2021, 43 (12) : 4205 - 4216
  • [7] Learning to Prompt for Vision-Language Models
    Zhou, Kaiyang
    Yang, Jingkang
    Loy, Chen Change
    Liu, Ziwei
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2022, 130 (09) : 2337 - 2348
  • [8] Vision-Language Models for Biomedical Applications
    Thapa, Surendrabikram
    Naseem, Usman
    Zhou, Luping
    Kim, Jinman
    PROCEEDINGS OF THE FIRST INTERNATIONAL WORKSHOP ON VISION-LANGUAGE MODELS FOR BIOMEDICAL APPLICATIONS, VLM4BIO 2024, 2024, : 1 - 2
  • [9] Learning to Prompt for Vision-Language Models
    Kaiyang Zhou
    Jingkang Yang
    Chen Change Loy
    Ziwei Liu
    International Journal of Computer Vision, 2022, 130 : 2337 - 2348
  • [10] The Neglected Tails in Vision-Language Models
    Parashar, Shubham
    Lin, Zhiqiu
    Liu, Tian
    Dong, Xiangjue
    Li, Yanan
    Ramanan, Deva
    Caverlee, James
    Kong, Shu
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 12988 - 12997