Exploring low-resource medical image classification with weakly supervised prompt learning

Cited by: 2
Authors
Zheng, Fudan [1 ]
Cao, Jindong [1 ]
Yu, Weijiang [2 ]
Chen, Zhiguang [1 ]
Xiao, Nong [1 ]
Lu, Yutong [1 ]
Affiliations
[1] Sun Yat Sen Univ, Guangzhou Higher Educ Mega Ctr, 132 Waihuandong Rd, Guangzhou 510006, Peoples R China
[2] Huawei Technol Co Ltd, Huawei Ind Pk, Shenzhen 518129, Peoples R China
Keywords
Medical image classification; Weakly supervised learning; Prompt learning; Few-shot learning; Zero-shot learning;
DOI
10.1016/j.patcog.2024.110250
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Most advances in medical image recognition for clinical auxiliary diagnosis are hampered by the low-resource situation in the medical field, where annotations are expensive and require professional expertise. This low-resource problem can be alleviated by leveraging the transferable representations of large-scale pre-trained vision-language models such as CLIP. After pre-training on large-scale unlabeled medical images and texts (such as medical reports), vision-language models learn transferable representations and support flexible downstream clinical tasks, such as medical image classification via relevant medical text prompts. However, existing pre-trained vision-language models require domain experts (clinicians) to carefully design the medical text prompts for each dataset when applied to specific medical image tasks, which is extremely time-consuming and greatly increases the burden on clinicians. To address this problem, we propose MedPrompt, a weakly supervised prompt learning method for automatically generating medical prompts, which comprises an unsupervised pre-trained vision-language model and a weakly supervised prompt learning model. The unsupervised pre-trained vision-language model adopts large-scale medical images and texts for pre-training, exploiting the natural correlation between medical images and their corresponding medical texts without manual annotations. The weakly supervised prompt learning model uses only the image classes in the dataset to guide the learning of the class-specific vector in each prompt, while the remaining context vectors in the prompt are learned without any manual annotation. To the best of our knowledge, this is the first model to automatically generate medical prompts.
With the assistance of these prompts, the pre-trained vision-language model is freed from the strong expert dependency of manual annotation and manual prompt design, achieving end-to-end, low-cost medical image classification. Experimental results show that the model using our automatically generated prompts outperforms all of its hand-crafted prompt counterparts in full-shot learning on all four datasets, achieves superior accuracy on zero-shot image classification and few-shot learning on three of the four medical benchmark datasets, and achieves comparable accuracy on the remaining one. In addition, the proposed prompt generator is lightweight and can therefore potentially be embedded into any network architecture.
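The prompt-learning scheme the abstract describes follows the CoOp-style recipe: a few learnable context vectors are shared across classes, a class-specific vector completes each prompt, and classification scores come from the similarity between the image embedding and each prompt's text embedding. A minimal NumPy sketch under those assumptions (all names and shapes are hypothetical, and mean pooling stands in for the real text encoder; this is not the paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
n_ctx, dim, n_classes = 4, 8, 3  # toy sizes, hypothetical

# Learnable context vectors shared across all classes (trained without labels)
ctx = rng.normal(size=(n_ctx, dim))
# One class-specific vector per class (learning guided only by class names)
cls_vecs = rng.normal(size=(n_classes, dim))

def encode_prompt(ctx, cls_vec):
    """A prompt = [context vectors; class vector], pooled to one unit vector.
    Mean pooling is a stand-in for a real text encoder."""
    tokens = np.vstack([ctx, cls_vec[None, :]])
    emb = tokens.mean(axis=0)
    return emb / np.linalg.norm(emb)

text_embs = np.stack([encode_prompt(ctx, v) for v in cls_vecs])

# Zero-shot classification: cosine similarity between the (unit-normalized)
# image embedding and each prompt embedding, softmaxed into class probabilities
img = rng.normal(size=dim)
img = img / np.linalg.norm(img)
logits = text_embs @ img
probs = np.exp(logits) / np.exp(logits).sum()
pred = int(np.argmax(probs))
```

In training, `ctx` and `cls_vecs` would be updated by backpropagating a classification loss through a frozen vision-language model; here they are random placeholders to show the data flow only.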
Pages: 12
Related papers (50 total)
  • [1] EXPLORING SELF-SUPERVISED REPRESENTATION LEARNING FOR LOW-RESOURCE MEDICAL IMAGE ANALYSIS
    Chattopadhyay, Soumitri
    Ganguly, Soham
    Chaudhury, Sreejit
    Nag, Sayan
    Chattopadhyay, Samiran
    [J]. 2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 1440 - 1444
  • [2] Large scale weakly and semi-supervised learning for low-resource video ASR
    Singh, Kritika
    Manohar, Vimal
    Xiao, Alex
    Edunov, Sergey
    Girshick, Ross
    Liptchinsky, Vitaliy
    Fuegen, Christian
    Saraf, Yatharth
    Zweig, Geoffrey
    Mohamed, Abdelrahman
    [J]. INTERSPEECH 2020, 2020, : 3770 - 3774
  • [3] Weakly supervised scene text generation for low-resource languages
    Xie, Yangchen
    Chen, Xinyuan
    Zhan, Hongjian
    Shivakumara, Palaiahnakote
    Yin, Bing
    Liu, Cong
    Lu, Yue
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 237
  • [4] Prompt-based for Low-Resource Tibetan Text Classification
    An, Bo
    [J]. ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2023, 22 (08)
  • [6] Prompt Tuning on Graph-Augmented Low-Resource Text Classification
    Wen, Zhihao
    Fang, Yuan
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (12) : 9080 - 9095
  • [7] Medical hyperspectral image classification based weakly supervised single-image global learning network
    Zhang, Chenglong
    Mou, Lichao
    Shan, Shihao
    Zhang, Hao
    Qi, Yafei
    Yu, Dexin
    Zhu, Xiao Xiang
    Sun, Nianzheng
    Zheng, Xiangrong
    Ma, Xiaopeng
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 133
  • [8] Weakly Supervised POS Taggers Perform Poorly on Truly Low-Resource Languages
    Kann, Katharina
    Lacroix, Ophelie
    Sogaard, Anders
    [J]. THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 8066 - 8073
  • [9] Exploring Multitask Learning for Low-Resource Abstractive Summarization
    Magooda, Ahmed
    Elaraby, Mohamed
    Litman, Diane
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 1652 - 1661
  • [10] Weakly Supervised Classification of Hyperspectral Image Based on Complementary Learning
    Huang, Lingbo
    Chen, Yushi
    He, Xin
    [J]. REMOTE SENSING, 2021, 13 (24)