Generative models improve fairness of medical classifiers under distribution shifts

Cited by: 18
Authors
Ktena, Ira [1]
Wiles, Olivia [1]
Albuquerque, Isabela [1]
Rebuffi, Sylvestre-Alvise [1]
Tanno, Ryutaro [1]
Roy, Abhijit Guha [2]
Azizi, Shekoofeh [1]
Belgrave, Danielle [3]
Kohli, Pushmeet [1]
Cemgil, Taylan [1]
Karthikesalingam, Alan [2]
Gowal, Sven [1]
Affiliations
[1] Google DeepMind, London, England
[2] Google Res, London, England
[3] GSKai, London, England
Keywords
PERFORMANCE; IMAGES;
DOI
10.1038/s41591-024-02838-6
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Domain generalization is a ubiquitous challenge for machine learning in healthcare. Model performance in real-world conditions might be lower than expected because of discrepancies between the data encountered during deployment and development. Underrepresentation of some groups or conditions during model development is a common cause of this phenomenon. This challenge is often not readily addressed by targeted data acquisition and 'labeling' by expert clinicians, which can be prohibitively expensive or practically impossible because of the rarity of conditions or the available clinical expertise. We hypothesize that advances in generative artificial intelligence can help mitigate this unmet need in a steerable fashion, enriching our training dataset with synthetic examples that address shortfalls of underrepresented conditions or subgroups. We show that diffusion models can automatically learn realistic augmentations from data in a label-efficient manner. We demonstrate that learned augmentations make models more robust and statistically fair in-distribution and out of distribution. To evaluate the generality of our approach, we studied three distinct medical imaging contexts of varying difficulty: (1) histopathology, (2) chest X-ray and (3) dermatology images. Complementing real samples with synthetic ones improved the robustness of models in all three medical tasks and increased fairness by improving the accuracy of clinical diagnosis within underrepresented groups, especially out of distribution. By generating synthetic image samples specific to underrepresented groups, diffusion models help medical image classifiers to achieve greater fairness metrics across a variety of medical disciplines and demographic attributes.
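As a rough illustration of the idea in the abstract, and not the authors' implementation, the sketch below mixes real and synthetic training samples at a chosen ratio and measures a simple fairness gap as the difference between the best and worst per-group accuracy. The function names, the `synthetic_fraction` parameter, and this particular gap definition are assumptions made for the example; the paper may use different mixing strategies and fairness metrics.

```python
import random
from collections import defaultdict


def mix_real_and_synthetic(real, synthetic, synthetic_fraction, seed=0):
    """Hypothetical helper: keep all real samples and add synthetic ones so that
    roughly `synthetic_fraction` of the returned list is synthetic."""
    if not 0.0 <= synthetic_fraction < 1.0:
        raise ValueError("synthetic_fraction must be in [0, 1)")
    rng = random.Random(seed)
    # Number of synthetic samples needed to reach the requested share.
    n_syn = round(len(real) * synthetic_fraction / (1.0 - synthetic_fraction))
    # Never request more synthetic samples than are available.
    n_syn = min(n_syn, len(synthetic))
    mixed = list(real) + rng.sample(list(synthetic), n_syn)
    rng.shuffle(mixed)
    return mixed


def per_group_accuracy(labels, predictions, groups):
    """Accuracy computed separately for each subgroup (e.g. a demographic attribute)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for y, p, g in zip(labels, predictions, groups):
        total[g] += 1
        correct[g] += int(y == p)
    return {g: correct[g] / total[g] for g in total}


def fairness_gap(labels, predictions, groups):
    """Assumed metric: best-group accuracy minus worst-group accuracy."""
    acc = per_group_accuracy(labels, predictions, groups)
    return max(acc.values()) - min(acc.values())
```

A smaller gap after training on the mixed set than on the real set alone would be consistent with the fairness improvement the abstract describes, although the appropriate metric depends on the clinical task.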
Pages: 1166-1173
Number of pages: 8