Aligning Medical Images with General Knowledge from Large Language Models

Cited by: 0

Authors:

Fang, Xiao [1]

Lin, Yi [1]

Zhang, Dong [2]

Cheng, Kwang-Ting [2]

Chen, Hao [1,3,4]

Affiliations:

[1] HKUST, Dept Comp Sci & Engn, Hong Kong, Peoples R China

[2] HKUST, Dept Elect & Comp Engn, Hong Kong, Peoples R China

[3] HKUST, Dept Chem & Biol Engn, Hong Kong, Peoples R China

[4] HKUST Shenzhen Hong Kong Collaborat Innovat Res I, Shenzhen, Peoples R China

Source:

MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2024, PT X | 2024 / Vol. 15010

Keywords:

Prompt Learning; Vision-Language Models; Large Language Model; Medical Image Analysis;

DOI:

10.1007/978-3-031-72117-5_6

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Pre-trained large vision-language models (VLMs) like CLIP have revolutionized visual representation learning by using natural language as supervision, and have demonstrated promising generalization ability. In this work, we propose ViP, a novel visual symptom-guided prompt learning framework for medical image analysis, which facilitates general knowledge transfer from CLIP. ViP consists of two key components: a visual symptom generator (VSG) and a dual-prompt network. Specifically, VSG aims to extract explicable visual symptoms from pre-trained large language models, while the dual-prompt network utilizes these visual symptoms to guide the training of two learnable prompt modules, i.e., context prompt and merge prompt, which effectively adapt our framework to medical image analysis via large VLMs. Extensive experimental results demonstrate that ViP can outperform state-of-the-art methods on two challenging datasets. The code is available at https://github.com/xiaofang007/ViP.
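As a rough illustration of the general idea in the abstract (scoring an image against per-class visual symptom descriptions in a shared embedding space), the following minimal sketch may help. It is not the authors' implementation: the function names `cosine` and `classify`, the toy 3-dimensional vectors, and the class labels are all hypothetical, and a plain mean is used to aggregate symptom scores, whereas the abstract describes learnable context and merge prompts. In practice the features would come from a VLM's image and text encoders.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def classify(image_feat, symptom_feats_by_class):
    """Score each class by the mean cosine similarity between the image
    feature and that class's symptom text features (assumed aggregation,
    standing in for a learned merge step)."""
    scores = {
        label: sum(cosine(image_feat, f) for f in feats) / len(feats)
        for label, feats in symptom_feats_by_class.items()
    }
    best = max(scores, key=scores.get)
    return best, scores

# Hand-made toy vectors standing in for encoder outputs (hypothetical).
image_feat = [0.9, 0.1, 0.0]
symptom_feats = {
    "class_a": [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]],
    "class_b": [[0.0, 1.0, 0.0], [0.0, 0.8, 0.6]],
}

pred, scores = classify(image_feat, symptom_feats)
print(pred, scores)
```

The mean is only the simplest possible aggregation; the authors' repository linked in the abstract is the reference for how ViP actually combines symptom-level information.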

Pages: 57-67

Page count: 11
