Generative models improve fairness of medical classifiers under distribution shifts

Cited by: 18

Authors:

Ktena, Ira [1]

Wiles, Olivia [1]

Albuquerque, Isabela [1]

Rebuffi, Sylvestre-Alvise [1]

Tanno, Ryutaro [1]

Roy, Abhijit Guha [2]

Azizi, Shekoofeh [1]

Belgrave, Danielle [3]

Kohli, Pushmeet [1]

Cemgil, Taylan [1]

Karthikesalingam, Alan [2]

Gowal, Sven [1]

Affiliations:

[1] Google DeepMind, London, England

[2] Google Research, London, England

[3] GSKai, London, England

Source:

NATURE MEDICINE | 2024 / Volume 30 / Issue 04

Keywords:

PERFORMANCE; IMAGES;

DOI:

10.1038/s41591-024-02838-6

Chinese Library Classification (CLC) number:

Q5 [Biochemistry]; Q7 [Molecular Biology];

Discipline classification code:

071010 ; 081704 ;

Abstract:

Domain generalization is a ubiquitous challenge for machine learning in healthcare. Model performance in real-world conditions might be lower than expected because of discrepancies between the data encountered during deployment and development. Underrepresentation of some groups or conditions during model development is a common cause of this phenomenon. This challenge is often not readily addressed by targeted data acquisition and 'labeling' by expert clinicians, which can be prohibitively expensive or practically impossible because of the rarity of conditions or the available clinical expertise. We hypothesize that advances in generative artificial intelligence can help mitigate this unmet need in a steerable fashion, enriching our training dataset with synthetic examples that address shortfalls of underrepresented conditions or subgroups. We show that diffusion models can automatically learn realistic augmentations from data in a label-efficient manner. We demonstrate that learned augmentations make models more robust and statistically fair in-distribution and out of distribution. To evaluate the generality of our approach, we studied three distinct medical imaging contexts of varying difficulty: (1) histopathology, (2) chest X-ray and (3) dermatology images. Complementing real samples with synthetic ones improved the robustness of models in all three medical tasks and increased fairness by improving the accuracy of clinical diagnosis within underrepresented groups, especially out of distribution. By generating synthetic image samples specific to underrepresented groups, diffusion models help medical image classifiers to achieve greater fairness metrics across a variety of medical disciplines and demographic attributes.

Pages: 1166-1173

Page count: 8

