Improving the repeatability of deep learning models with Monte Carlo dropout

Cited by: 0
Authors
Andreanne Lemay
Katharina Hoebel
Christopher P. Bridge
Brian Befano
Silvia De Sanjosé
Didem Egemen
Ana Cecilia Rodriguez
Mark Schiffman
John Peter Campbell
Jayashree Kalpathy-Cramer
Affiliations
[1] Martinos Center for Biomedical Imaging, Department of Epidemiology
[2] NeuroPoly, Division of Cancer Epidemiology & Genetics
[3] Polytechnique Montreal
[4] Massachusetts Institute of Technology
[5] MGH & BWH Center for Clinical Data Science
[6] University of Washington School of Public Health
[7] National Cancer Institute
[8] Oregon Health and Science University
DOI: not available
Abstract
The integration of artificial intelligence into clinical workflows requires reliable and robust models. Repeatability is a key attribute of model robustness. An ideally repeatable model outputs identical predictions across independent tests carried out under similar conditions. However, slight variations, though not ideal, may be unavoidable and acceptable in practice. During model development and evaluation, much attention is given to classification performance, while model repeatability is rarely assessed, leading to the development of models that are unusable in clinical practice. In this work, we evaluate the repeatability of four model types (binary classification, multi-class classification, ordinal classification, and regression) on images that were acquired from the same patient during the same visit. We study each model's performance on four medical image classification tasks from public and private datasets: knee osteoarthritis, cervical cancer screening, breast density estimation, and retinopathy of prematurity. Repeatability is measured and compared on ResNet and DenseNet architectures. Moreover, we assess the impact of sampling Monte Carlo dropout predictions at test time on classification performance and repeatability. Leveraging Monte Carlo predictions significantly increases repeatability, in particular at the class boundaries, for all tasks on the binary, multi-class, and ordinal models, leading to an average reduction of the 95% limits of agreement by 16 percentage points and of the class disagreement rate by 7 percentage points. The classification accuracy improves in most settings along with the repeatability. Our results suggest that beyond about 20 Monte Carlo iterations, there is no further gain in repeatability. In addition to the higher test-retest agreement, Monte Carlo predictions are better calibrated, which leads to output probabilities reflecting more accurately the true likelihood of being correctly classified.
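The test-time procedure described above, keeping dropout active during inference and averaging the predicted probabilities over repeated stochastic forward passes, can be sketched in plain Python. This is a minimal illustration on a toy single-layer classifier, not the authors' implementation; the model, weights, and function names (`forward`, `mc_dropout_predict`) are hypothetical, and the default of 20 iterations reflects the abstract's observation that more iterations yield no further repeatability gain.

```python
import math
import random

def softmax(logits):
    """Convert raw logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def forward(features, weights, drop_p, rng):
    """One stochastic forward pass: sample a dropout mask over the
    input units (kept active at test time), then apply a linear layer."""
    # Inverted dropout: surviving units are scaled by 1 / (1 - p)
    mask = [0.0 if rng.random() < drop_p else 1.0 / (1.0 - drop_p)
            for _ in features]
    dropped = [x * m_i for x, m_i in zip(features, mask)]
    logits = [sum(x * w for x, w in zip(dropped, row)) for row in weights]
    return softmax(logits)

def mc_dropout_predict(features, weights, drop_p=0.2, n_iter=20, seed=0):
    """Average class probabilities over n_iter Monte Carlo dropout samples."""
    rng = random.Random(seed)
    sums = [0.0] * len(weights)
    for _ in range(n_iter):
        probs = forward(features, weights, drop_p, rng)
        sums = [s + p for s, p in zip(sums, probs)]
    return [s / n_iter for s in sums]
```

Because each forward pass samples a different dropout mask, the averaged output is a smoother estimate than any single pass; near class boundaries this averaging is what reduces test-retest disagreement.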
Related papers (50 total)
  • [1] Improving the repeatability of deep learning models with Monte Carlo dropout
    Lemay, Andreanne
    Hoebel, Katharina
    Bridge, Christopher P.
    Befano, Brian
    De Sanjose, Silvia
    Egemen, Didem
    Rodriguez, Ana Cecilia
    Schiffman, Mark
    Campbell, John Peter
    Kalpathy-Cramer, Jayashree
    [J]. NPJ DIGITAL MEDICINE, 2022, 5 (01)
  • [2] Monte Carlo dropout for increased deep learning repeatability and disease classification performance in retinopathy of prematurity
    Coyner, Aaron
    Lemay, Andreanne
    Hoebel, Katharina
    Singh, Praveer
    Ostmo, Susan
    Chiang, Michael
    Kalpathy-Cramer, Jayashree
    Campbell, J.
    [J]. INVESTIGATIVE OPHTHALMOLOGY & VISUAL SCIENCE, 2023, 64 (08)
  • [3] Assessing the uncertainty of deep learning soil spectral models using Monte Carlo dropout
    Padarian, J.
    Minasny, B.
    McBratney, A. B.
    [J]. GEODERMA, 2022, 425
  • [5] MONTE CARLO DROPOUT BASED ACTIVE LEARNING FOR DEEP LEARNING IN STRUCTURAL SIMULATION
    Jiang, Chunhao
    Chen, Nian-Zhong
    Zhao, Zhimin
    [J]. PROCEEDINGS OF ASME 2024 43RD INTERNATIONAL CONFERENCE ON OCEAN, OFFSHORE AND ARCTIC ENGINEERING, OMAE2024, VOL 2, 2024,
  • [6] The Effect of Training Data Quantity on Monte Carlo Dropout Uncertainty Quantification in Deep Learning
    Cusack, Harrison
    Bialkowski, Alina
    [J]. 2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,
  • [7] Learning Deep Latent Gaussian Models with Markov Chain Monte Carlo
    Hoffman, Matthew D.
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
  • [8] Improving predictive uncertainty estimation using Dropout–Hamiltonian Monte Carlo
    Hernández, Sergio
    Vergara, Diego
    Valdenegro-Toro, Matías
    Jorquera, Felipe
    [J]. Soft Computing, 2020, 24 : 4307 - 4322
  • [9] Challenges for the Repeatability of Deep Learning Models
    Alahmari, Saeed S.
    Goldgof, Dmitry B.
    Mouton, Peter R.
    Hall, Lawrence O.
    [J]. IEEE ACCESS, 2020, 8 : 211860 - 211868
  • [10] Epistemic Uncertainty and Model Transparency in Rock Facies Classification Using Monte Carlo Dropout Deep Learning
    Hossain, Touhid Mohammad
    Hermana, Maman
    Abdulkadir, Said Jadid
    [J]. IEEE ACCESS, 2023, 11 : 89349 - 89358