In-context learning enables multimodal large language models to classify cancer pathology images

Authors
Dyke Ferber [1]
Georg Wölflein [2]
Isabella C. Wiest [3]
Marta Ligero [4]
Srividhya Sainath [3]
Narmin Ghaffari Laleh [5]
Omar S. M. El Nahhas [3]
Gustav Müller-Franzes [3]
Dirk Jäger [3]
Daniel Truhn [3]
Jakob Nikolas Kather [6]
Affiliations
[1] Heidelberg University Hospital, National Center for Tumor Diseases (NCT)
[2] Heidelberg University Hospital, Department of Medical Oncology
[3] Technical University Dresden, Else Kroener Fresenius Center for Digital Health
[4] University of St Andrews, School of Computer Science
[5] Heidelberg University, Department of Medicine II, Medical Faculty Mannheim
[6] University Hospital Aachen, Department of Diagnostic and Interventional Radiology
[7] University Hospital Dresden, Department of Medicine I
DOI
10.1038/s41467-024-51465-9
Abstract
Medical image classification requires labeled, task-specific datasets, which are used to train deep learning networks de novo or to fine-tune foundation models. However, this process is computationally and technically demanding. In language processing, in-context learning provides an alternative, where models learn from examples within the prompt, bypassing the need for parameter updates. Yet in-context learning remains underexplored in medical image analysis. Here, we systematically evaluate Generative Pretrained Transformer 4 with Vision capabilities (GPT-4V) with in-context learning on three cancer histopathology tasks of high importance: classification of tissue subtypes in colorectal cancer, colon polyp subtyping, and breast tumor detection in lymph node sections. Our results show that in-context learning is sufficient to match or even outperform specialized neural networks trained for particular tasks, while requiring only a minimal number of samples. In summary, this study demonstrates that large vision-language models trained on non-domain-specific data can be applied out of the box to solve medical image-processing tasks in histopathology. This democratizes access to generalist AI models for medical experts without a technical background, especially in areas where annotated data is scarce.