Visual Language Pretrained Multiple Instance Zero-Shot Transfer for Histopathology Images

Cited by: 37
Authors
Lu, Ming Y. [1 ,2 ,3 ]
Chen, Bowen [2 ,3 ]
Zhang, Andrew [1 ,2 ,3 ]
Williamson, Drew F. K. [2 ,3 ]
Chen, Richard J. [2 ,3 ]
Ding, Tong [2 ,3 ]
Le, Long Phi [2 ,3 ]
Chuang, Yung-Sung [1 ]
Mahmood, Faisal [2 ,3 ]
Affiliations
[1] MIT, Cambridge, MA 02139 USA
[2] Harvard Univ, Cambridge, MA 02138 USA
[3] Mass Gen Brigham, Boston, MA 02199 USA
Keywords
SYSTEM;
DOI
10.1109/CVPR52729.2023.01893
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Contrastive visual language pretraining has emerged as a powerful method for either training new language-aware image encoders or augmenting existing pretrained models with zero-shot visual recognition capabilities. However, existing works typically train on large datasets of image-text pairs and are designed for downstream tasks involving only small to medium-sized images, neither of which is applicable to the emerging field of computational pathology, where publicly available paired image-text datasets are limited and each image can span up to 100,000 x 100,000 pixels. In this paper, we present MI-Zero, a simple and intuitive framework for unleashing the zero-shot transfer capabilities of contrastively aligned image and text models on gigapixel histopathology whole slide images, enabling multiple downstream diagnostic tasks to be carried out by pretrained encoders without requiring any additional labels. MI-Zero reformulates zero-shot transfer under the framework of multiple instance learning to overcome the computational challenge of inference on extremely large images. We used over 550k pathology reports and other available in-domain text corpora to pretrain our text encoder. By effectively leveraging strong pretrained encoders, our best model, pretrained on over 33k histopathology image-caption pairs, achieves an average median zero-shot accuracy of 70.2% across three different real-world cancer subtyping tasks. Our code is available at: https://github.com/mahmoodlab/MI-Zero.
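The multiple-instance reformulation described in the abstract can be sketched as follows: embed each patch of a whole slide image and each class prompt in the shared image-text space, score patches against prompts by cosine similarity, then pool patch-level scores into a single slide-level prediction. This is a minimal, hypothetical illustration; the function name, array layout, and the choice of top-K mean pooling are assumptions for the sketch, not the authors' exact implementation.

```python
import numpy as np

def mi_zero_predict(patch_embs: np.ndarray, class_text_embs: np.ndarray, topk: int = 5):
    """Zero-shot slide classification via multiple instance pooling (sketch).

    patch_embs:      (num_patches, dim) image-encoder embeddings of slide patches
    class_text_embs: (num_classes, dim) text-encoder embeddings of class prompts
    Returns the predicted class index and per-class pooled scores.
    """
    # L2-normalize so dot products are cosine similarities
    p = patch_embs / np.linalg.norm(patch_embs, axis=1, keepdims=True)
    t = class_text_embs / np.linalg.norm(class_text_embs, axis=1, keepdims=True)

    sims = p @ t.T  # (num_patches, num_classes) patch-vs-prompt similarities

    # Pool instance scores to the bag (slide) level: mean of the top-K
    # patches per class; topk is capped at the number of patches
    k = min(topk, sims.shape[0])
    pooled = np.sort(sims, axis=0)[-k:].mean(axis=0)  # (num_classes,)
    return int(np.argmax(pooled)), pooled
```

Because pooling happens over precomputed patch embeddings, the gigapixel slide is never processed as one image, which is the computational point of the MIL reformulation.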
Pages: 19764 - 19775
Page count: 12
Related Papers
50 records total
  • [31] Zero-shot Natural Language Video Localization
    Nam, Jinwoo
    Ahn, Daechul
    Kang, Dongyeop
    Ha, Seong Jong
    Choi, Jonghyun
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 1450 - 1459
  • [32] Large Language Models are Zero-Shot Reasoners
    Kojima, Takeshi
    Gu, Shixiang Shane
    Reid, Machel
    Matsuo, Yutaka
    Iwasawa, Yusuke
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [33] Zero-Shot Learning via Visual Abstraction
    Antol, Stanislaw
    Zitnick, C. Lawrence
    Parikh, Devi
    COMPUTER VISION - ECCV 2014, PT IV, 2014, 8692 : 401 - 416
  • [34] Visual Context Embeddings for Zero-Shot Recognition
    Cho, Gunhee
    Choi, Yong Suk
    37TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, 2022, : 1039 - 1047
  • [35] Transfer and zero-shot learning for scalable weed detection and classification in UAV images
    Belissent, Nicolas
    Pena, Jose M.
    Mesias-Ruiz, Gustavo A.
    Shawe-Taylor, John
    Perez-Ortiz, Maria
    KNOWLEDGE-BASED SYSTEMS, 2024, 292
  • [36] Towards Zero-Shot Sign Language Recognition
    Bilge, Yunus Can
    Cinbis, Ramazan Gokberk
    Ikizler-Cinbis, Nazli
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (01) : 1217 - 1232
  • [37] Language Models as Zero-Shot Trajectory Generators
    Kwon, Teyun
    Di Palo, Norman
    Johns, Edward
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (07): 6728 - 6735
  • [38] ZERO-SHOT PRONUNCIATION LEXICONS FOR CROSS-LANGUAGE ACOUSTIC MODEL TRANSFER
    Wiesner, Matthew
    Adams, Oliver
    Yarowsky, David
    Trmal, Jan
    Khudanpur, Sanjeev
    2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 1048 - 1054
  • [39] Effective Guidance in Zero-Shot Multilingual Translation via Multiple Language Prototypes
    Zheng, Yafang
    Lin, Lei
    Yuan, Yuxuan
    Shi, Xiaodong
    NEURAL INFORMATION PROCESSING, ICONIP 2023, PT VI, 2024, 14452 : 226 - 238
  • [40] Relational Knowledge Transfer for Zero-Shot Learning
    Wang, Donghui
    Li, Yanan
    Lin, Yuetan
    Zhuang, Yueting
    THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016, : 2145 - 2151