Visual Language Pretrained Multiple Instance Zero-Shot Transfer for Histopathology Images

Cited by: 37
Authors
Lu, Ming Y. [1 ,2 ,3 ]
Chen, Bowen [2 ,3 ]
Zhang, Andrew [1 ,2 ,3 ]
Williamson, Drew F. K. [2 ,3 ]
Chen, Richard J. [2 ,3 ]
Ding, Tong [2 ,3 ]
Le, Long Phi [2 ,3 ]
Chuang, Yung-Sung [1 ]
Mahmood, Faisal [2 ,3 ]
Affiliations
[1] MIT, Cambridge, MA 02139 USA
[2] Harvard Univ, Cambridge, MA 02138 USA
[3] Mass Gen Brigham, Boston, MA 02199 USA
Keywords
SYSTEM;
DOI
10.1109/CVPR52729.2023.01893
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Contrastive visual language pretraining has emerged as a powerful method for either training new language-aware image encoders or augmenting existing pretrained models with zero-shot visual recognition capabilities. However, existing works typically train on large datasets of image-text pairs and have been designed to perform downstream tasks involving only small to medium-sized images, neither of which is applicable to the emerging field of computational pathology, where there are limited publicly available paired image-text datasets and each image can span up to 100,000 x 100,000 pixels. In this paper, we present MI-Zero, a simple and intuitive framework for unleashing the zero-shot transfer capabilities of contrastively aligned image and text models on gigapixel histopathology whole slide images, enabling multiple downstream diagnostic tasks to be carried out by pretrained encoders without requiring any additional labels. MI-Zero reformulates zero-shot transfer under the framework of multiple instance learning to overcome the computational challenge of inference on extremely large images. We used over 550k pathology reports and other available in-domain text corpora to pretrain our text encoder. By effectively leveraging strong pretrained encoders, our best model, pretrained on over 33k histopathology image-caption pairs, achieves an average median zero-shot accuracy of 70.2% across three different real-world cancer subtyping tasks. Our code is available at: https://github.com/mahmoodlab/MI-Zero.
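The abstract describes casting zero-shot transfer as a multiple instance learning problem: a gigapixel slide is tiled into patches (instances), each patch is scored against class text prompts in the shared embedding space, and the patch-level scores are pooled into a slide-level prediction. A minimal sketch of this idea, assuming precomputed L2-normalized patch and prompt embeddings from an aligned encoder pair and top-k mean pooling (one common MIL aggregation choice; function and parameter names here are illustrative, not the authors' API):

```python
import numpy as np

def mi_zero_predict(patch_embs, class_text_embs, top_k=5):
    """Zero-shot slide classification via multiple-instance pooling (sketch).

    patch_embs:      (N, d) L2-normalized embeddings of the N patches of one slide
    class_text_embs: (C, d) L2-normalized embeddings of C class prompts
    Returns the index of the predicted class.
    """
    # Patch-level cosine-similarity scores against every class prompt: (N, C)
    sims = patch_embs @ class_text_embs.T
    # Pool instance scores into one slide-level score per class by averaging
    # each class's top-k most similar patches.
    k = min(top_k, sims.shape[0])
    topk = np.sort(sims, axis=0)[-k:]   # (k, C) highest scores per class
    slide_scores = topk.mean(axis=0)    # (C,)
    return int(np.argmax(slide_scores))
```

Top-k pooling makes the slide prediction depend on the most confidently matched patches, which matters when only a small fraction of a gigapixel slide contains diagnostic tissue.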
Pages: 19764-19775
Page count: 12
Related Papers (50 records)
  • [41] Hypernetworks for Zero-Shot Transfer in Reinforcement Learning
    Rezaei-Shoshtari, Sahand
    Morissette, Charlotte
    Hogan, Francois R.
    Dudek, Gregory
    Meger, David
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 8, 2023, : 9579 - 9587
  • [42] Zero-Shot Transfer Learning for Event Extraction
    Huang, Lifu
    Ji, Heng
    Cho, Kyunghyun
    Dagan, Ido
    Riedel, Sebastian
    Voss, Clare R.
    PROCEEDINGS OF THE 56TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL), VOL 1, 2018, : 2160 - 2170
  • [43] ZVQAF: Zero-shot visual question answering with feedback from large language models
    Liu, Cheng
    Wang, Chao
    Peng, Yan
    Li, Zhixu
    NEUROCOMPUTING, 2024, 580
  • [44] Combined scaling for zero-shot transfer learning
    Pham, Hieu
    Dai, Zihang
    Ghiasi, Golnaz
    Kawaguchi, Kenji
    Liu, Hanxiao
    Yu, Adams Wei
    Yu, Jiahui
    Chen, Yi-Ting
    Luong, Minh-Thang
    Wu, Yonghui
    Tan, Mingxing
    Le, Quoc V.
    NEUROCOMPUTING, 2023, 555
  • [45] Transfer Increment for Generalized Zero-Shot Learning
    Feng, Liangjun
    Zhao, Chunhui
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, 32 (06) : 2506 - 2520
  • [46] Modularized Transfer Learning with Multiple Knowledge Graphs for Zero-shot Commonsense Reasoning
    Kim, Yu Jin
    Kwak, Beong-woo
    Kim, Youngwook
    Amplayo, Reinald Kim
    Hwang, Seung-won
    Yeo, Jinyoung
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 2244 - 2257
  • [47] AmericasNLI: Evaluating Zero-shot Natural Language Understanding of Pretrained Multilingual Models in Truly Low-resource Languages
    Ebrahimi, Abteen
    Mager, Manuel
    Oncevay, Arturo
    Chaudhary, Vishrav
    Chiruzzo, Luis
    Fan, Angela
    Ortega, John E.
    Ramos, Ricardo
    Rios, Annette
    Meza-Ruiz, Ivan
    Gimenez-Lugo, Gustavo A.
    Mager, Elisabeth
    Neubig, Graham
    Palmer, Alexis
    Coto-Solano, Rolando
    Ngoc Thang Vu
    Kann, Katharina
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 6279 - 6299
  • [48] Zero-shot Visual Relation Detection via Composite Visual Cues from Large Language Models
    Li, Lin
    Xiao, Jun
    Chen, Guikun
    Shao, Jian
    Zhuang, Yueting
    Chen, Long
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [49] Few-Shot and Zero-Shot Semantic Segmentation for Food Images
    Honbu, Yuma
    Yanai, Keiji
    PROCEEDINGS OF THE 13TH INTERNATIONAL WORKSHOP ON MULTIMEDIA FOR COOKING AND EATING ACTIVITIES (CEA '21), 2021, : 25 - 28
  • [50] Webly-supervised zero-shot learning for artwork instance recognition
    Del Chiaro, Riccardo
    Bagdanov, Andrew D.
    Del Bimbo, Alberto
    PATTERN RECOGNITION LETTERS, 2019, 128 : 420 - 426