A whole-slide foundation model for digital pathology from real-world data

Authors
Xu, Hanwen [1,2]
Usuyama, Naoto [1]
Bagga, Jaspreet [1]
Zhang, Sheng [1]
Rao, Rajesh [1]
Naumann, Tristan [1]
Wong, Cliff [1]
Gero, Zelalem [1]
Gonzalez, Javier [1]
Gu, Yu [1]
Xu, Yanbo [1]
Wei, Mu [1]
Wang, Wenhui [1]
Ma, Shuming [1]
Wei, Furu [1]
Yang, Jianwei [1]
Li, Chunyuan [1]
Gao, Jianfeng [1]
Rosemon, Jaylen [3]
Bower, Tucker [3]
Lee, Soohee [4]
Weerasinghe, Roshanthi [4]
Wright, Bill J. [4]
Robicsek, Ari [4]
Piening, Brian [3,5]
Bifulco, Carlo [3,5]
Wang, Sheng [2,6]
Poon, Hoifung [1]
Affiliations
[1] Microsoft Res, Redmond, WA 98052 USA
[2] Univ Washington, Paul G Allen Sch Comp Sci Engn, Seattle, WA 98195 USA
[3] Providence Genom, Portland, OR 97225 USA
[4] Providence Res Network, Renton, WA USA
[5] Earle A Chiles Res Inst, Providence Canc Inst, Portland, OR 97213 USA
[6] Univ Washington, Dept Surg, Seattle, WA 98195 USA
DOI
10.1038/s41586-024-07441-w
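Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09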
Abstract
Digital pathology poses unique computational challenges, as a standard gigapixel slide may comprise tens of thousands of image tiles (refs. 1-3). Prior models have often resorted to subsampling a small portion of tiles for each slide, thus missing the important slide-level context (ref. 4). Here we present Prov-GigaPath, a whole-slide pathology foundation model pretrained on 1.3 billion 256 × 256 pathology image tiles in 171,189 whole slides from Providence, a large US health network comprising 28 cancer centres. The slides originated from more than 30,000 patients covering 31 major tissue types. To pretrain Prov-GigaPath, we propose GigaPath, a novel vision transformer architecture for pretraining gigapixel pathology slides. To scale GigaPath for slide-level learning with tens of thousands of image tiles, GigaPath adapts the newly developed LongNet method (ref. 5) to digital pathology. To evaluate Prov-GigaPath, we construct a digital pathology benchmark comprising 9 cancer subtyping tasks and 17 pathomics tasks, using both Providence and TCGA data (ref. 6). With large-scale pretraining and ultra-large-context modelling, Prov-GigaPath attains state-of-the-art performance on 25 out of 26 tasks, with significant improvement over the second-best method on 18 tasks. We further demonstrate the potential of Prov-GigaPath on vision-language pretraining for pathology (refs. 7,8) by incorporating the pathology reports. In sum, Prov-GigaPath is an open-weight foundation model that achieves state-of-the-art performance on various digital pathology tasks, demonstrating the importance of real-world data and whole-slide modelling.

Prov-GigaPath, a whole-slide pathology foundation model pretrained on a large dataset containing around 1.3 billion pathology images, attains state-of-the-art performance in cancer classification and pathomics tasks.
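
The scale quoted in the abstract can be checked with simple arithmetic. The sketch below is illustrative only and is not taken from the paper's code: the function name `tile_count` and the 50,000 × 40,000 pixel slide are hypothetical, and it assumes non-overlapping 256 × 256 tiles covering the full slide with no background filtering, so a real preprocessing pipeline would likely yield fewer tiles.

```python
import math

TILE_SIZE = 256  # tile edge length in pixels, as stated in the abstract


def tile_count(width_px: int, height_px: int, tile: int = TILE_SIZE) -> int:
    """Number of non-overlapping tiles needed to cover a slide of this size.

    Assumption: every tile is kept, including background; partial tiles at
    the right and bottom edges are counted as whole tiles.
    """
    return math.ceil(width_px / tile) * math.ceil(height_px / tile)


# Hypothetical slide dimensions (2 gigapixels), chosen only for illustration.
example_tiles = tile_count(50_000, 40_000)

# Average tiles per slide implied by the abstract's totals:
# 1.3 billion tiles across 171,189 whole slides.
avg_tiles_per_slide = 1_300_000_000 / 171_189

print(f"tiles in a 50,000 x 40,000 px slide: {example_tiles}")
print(f"average tiles per slide in the pretraining set: {avg_tiles_per_slide:.0f}")
```

Under these assumptions the hypothetical slide yields tens of thousands of tiles, consistent with the abstract's description, while the pretraining totals imply several thousand tiles per slide on average.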
Pages: 181-188
Page count: 22