Generative models improve fairness of medical classifiers under distribution shifts

Cited by: 18
Authors
Ktena, Ira [1]
Wiles, Olivia [1]
Albuquerque, Isabela [1]
Rebuffi, Sylvestre-Alvise [1]
Tanno, Ryutaro [1]
Roy, Abhijit Guha [2]
Azizi, Shekoofeh [1]
Belgrave, Danielle [3]
Kohli, Pushmeet [1]
Cemgil, Taylan [1]
Karthikesalingam, Alan [2]
Gowal, Sven [1]
Affiliations
[1] Google DeepMind, London, England
[2] Google Research, London, England
[3] GSKai, London, England
Keywords
PERFORMANCE; IMAGES
DOI
10.1038/s41591-024-02838-6
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Domain generalization is a ubiquitous challenge for machine learning in healthcare. Model performance in real-world conditions might be lower than expected because of discrepancies between the data encountered during deployment and development. Underrepresentation of some groups or conditions during model development is a common cause of this phenomenon. This challenge is often not readily addressed by targeted data acquisition and 'labeling' by expert clinicians, which can be prohibitively expensive or practically impossible because of the rarity of conditions or the available clinical expertise. We hypothesize that advances in generative artificial intelligence can help mitigate this unmet need in a steerable fashion, enriching our training dataset with synthetic examples that address shortfalls of underrepresented conditions or subgroups. We show that diffusion models can automatically learn realistic augmentations from data in a label-efficient manner. We demonstrate that learned augmentations make models more robust and statistically fair in-distribution and out of distribution. To evaluate the generality of our approach, we studied three distinct medical imaging contexts of varying difficulty: (1) histopathology, (2) chest X-ray and (3) dermatology images. Complementing real samples with synthetic ones improved the robustness of models in all three medical tasks and increased fairness by improving the accuracy of clinical diagnosis within underrepresented groups, especially out of distribution. By generating synthetic image samples specific to underrepresented groups, diffusion models help medical image classifiers to achieve greater fairness metrics across a variety of medical disciplines and demographic attributes.
Pages: 1166-1173
Number of pages: 8