Improving the repeatability of deep learning models with Monte Carlo dropout

Cited by: 0

Authors
Andreanne Lemay
Katharina Hoebel
Christopher P. Bridge
Brian Befano
Silvia De Sanjosé
Didem Egemen
Ana Cecilia Rodriguez
Mark Schiffman
John Peter Campbell
Jayashree Kalpathy-Cramer
Affiliations
[1] Martinos Center for Biomedical Imaging
[2] NeuroPoly, Polytechnique Montreal
[3] Massachusetts Institute of Technology
[4] MGH & BWH Center for Clinical Data Science
[5] Department of Epidemiology, University of Washington School of Public Health
[6] Division of Cancer Epidemiology & Genetics, National Cancer Institute
[7] Oregon Health and Science University
Abstract
The integration of artificial intelligence into clinical workflows requires reliable and robust models. Repeatability is a key attribute of model robustness. An ideally repeatable model outputs identical predictions during independent tests carried out under similar conditions. However, slight variations, though not ideal, may be unavoidable and acceptable in practice. During model development and evaluation, much attention is given to classification performance, while model repeatability is rarely assessed, leading to the development of models that are unusable in clinical practice. In this work, we evaluate the repeatability of four model types (binary classification, multi-class classification, ordinal classification, and regression) on images acquired from the same patient during the same visit. We study each model's performance on four medical image classification tasks from public and private datasets: knee osteoarthritis, cervical cancer screening, breast density estimation, and retinopathy of prematurity. Repeatability is measured and compared on ResNet and DenseNet architectures. Moreover, we assess the impact of sampling Monte Carlo dropout predictions at test time on classification performance and repeatability. Averaging Monte Carlo predictions significantly increases repeatability, in particular at the class boundaries, for all tasks on the binary, multi-class, and ordinal models, reducing the 95% limits of agreement by 16 percentage points and the class disagreement rate by 7 percentage points on average. Classification accuracy improves in most settings along with repeatability. Our results suggest that beyond about 20 Monte Carlo iterations there is no further gain in repeatability. In addition to the higher test-retest agreement, Monte Carlo predictions are better calibrated, so their output probabilities more accurately reflect the true likelihood of a correct classification.
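The test-time procedure the abstract describes — keeping dropout active during inference and averaging several stochastic forward passes — can be sketched in a few lines. The following is a minimal NumPy illustration, not the authors' implementation: the two-layer network, its random weights, and the dropout rate are all hypothetical stand-ins, chosen only to show why averaging reduces prediction variability between repeated tests.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer classifier with random weights (illustrative only).
W1 = rng.normal(size=(4, 16))
W2 = rng.normal(size=(16, 3))

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def stochastic_forward(x, p=0.5):
    """One forward pass with dropout left active at test time."""
    h = np.maximum(x @ W1, 0.0)        # ReLU hidden layer
    mask = rng.random(h.shape) > p     # Bernoulli dropout mask
    h = h * mask / (1.0 - p)           # inverted-dropout scaling
    return softmax(h @ W2)

def mc_dropout_predict(x, T=20, p=0.5):
    """Average T stochastic passes: the Monte Carlo dropout prediction."""
    return np.mean([stochastic_forward(x, p) for _ in range(T)], axis=0)

x = rng.normal(size=(4,))
single = [stochastic_forward(x) for _ in range(50)]        # 50 single passes
averaged = [mc_dropout_predict(x, T=20) for _ in range(50)]  # 50 MC averages

# Repeated MC-averaged predictions vary far less than single stochastic passes.
print("std (single pass):", np.std(single, axis=0).mean())
print("std (MC average): ", np.std(averaged, axis=0).mean())
```

Averaging T independent stochastic passes shrinks the prediction variance by roughly a factor of T, which is the mechanism behind the improved test-retest agreement; consistent with the abstract, gains flatten out around T ≈ 20.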