Generative models improve fairness of medical classifiers under distribution shifts

Cited by: 18

Authors:

Ktena, Ira [1]

Wiles, Olivia [1]

Albuquerque, Isabela [1]

Rebuffi, Sylvestre-Alvise [1]

Tanno, Ryutaro [1]

Roy, Abhijit Guha [2]

Azizi, Shekoofeh [1]

Belgrave, Danielle [3]

Kohli, Pushmeet [1]

Cemgil, Taylan [1]

Karthikesalingam, Alan [2]

Gowal, Sven [1]

Affiliations:

[1] Google DeepMind, London, England

[2] Google Research, London, England

[3] GSKai, London, England

Source:

NATURE MEDICINE | 2024 / Vol. 30 / Issue 04

Keywords:

PERFORMANCE; IMAGES;

DOI:

10.1038/s41591-024-02838-6

Chinese Library Classification (CLC) number:

Q5 [Biochemistry]; Q7 [Molecular Biology];

Discipline classification code:

071010 ; 081704 ;

Abstract:

Domain generalization is a ubiquitous challenge for machine learning in healthcare. Model performance in real-world conditions might be lower than expected because of discrepancies between the data encountered during deployment and development. Underrepresentation of some groups or conditions during model development is a common cause of this phenomenon. This challenge is often not readily addressed by targeted data acquisition and 'labeling' by expert clinicians, which can be prohibitively expensive or practically impossible because of the rarity of conditions or the available clinical expertise. We hypothesize that advances in generative artificial intelligence can help mitigate this unmet need in a steerable fashion, enriching our training dataset with synthetic examples that address shortfalls of underrepresented conditions or subgroups. We show that diffusion models can automatically learn realistic augmentations from data in a label-efficient manner. We demonstrate that learned augmentations make models more robust and statistically fair in-distribution and out of distribution. To evaluate the generality of our approach, we studied three distinct medical imaging contexts of varying difficulty: (1) histopathology, (2) chest X-ray and (3) dermatology images. Complementing real samples with synthetic ones improved the robustness of models in all three medical tasks and increased fairness by improving the accuracy of clinical diagnosis within underrepresented groups, especially out of distribution. By generating synthetic image samples specific to underrepresented groups, diffusion models help medical image classifiers to achieve greater fairness metrics across a variety of medical disciplines and demographic attributes.
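
The abstract describes fairness in terms of diagnostic accuracy within underrepresented groups. As a minimal sketch of how such a disparity might be quantified, the snippet below computes per-group accuracy and the gap between the best- and worst-performing group. This is one common choice and an assumption here: the record does not state which fairness metric the authors used. The function names `group_accuracies` and `fairness_gap` and the toy data are hypothetical, not taken from the paper.

```python
from collections import defaultdict


def group_accuracies(y_true, y_pred, groups):
    """Return {group: accuracy} for each subgroup label in `groups`."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        total[g] += 1
        correct[g] += int(t == p)
    return {g: correct[g] / total[g] for g in total}


def fairness_gap(accuracies):
    """Difference between best and worst subgroup accuracy (0 means parity)."""
    values = list(accuracies.values())
    return max(values) - min(values)


# Toy, made-up labels and predictions for two hypothetical subgroups.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

acc = group_accuracies(y_true, y_pred, groups)
gap = fairness_gap(acc)
print(acc, gap)
```

Under this definition, a classifier trained with additional synthetic samples for the underrepresented group would be considered fairer if the gap shrinks without the best group's accuracy dropping.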

Pages: 1166-1173

Number of pages: 8

