Assessing and Mitigating Bias in Medical Artificial Intelligence: The Effects of Race and Ethnicity on a Deep Learning Model for ECG Analysis

Cited by: 129
Authors
Noseworthy, Peter A. [1, 2]
Attia, Zachi I. [1]
Brewer, LaPrincess C. [1]
Hayes, Sharonne N. [1, 3]
Yao, Xiaoxi [1, 2, 4]
Kapa, Suraj [1]
Friedman, Paul A. [1]
Lopez-Jimenez, Francisco [1]
Affiliations
[1] Mayo Clin, Dept Cardiovasc Med, 200 1st St SW, Rochester, MN 55905 USA
[2] Mayo Clin, Robert D & Patricia E Kern Ctr Sci Hlth Care Delivery, Rochester, MN 55905 USA
[3] Mayo Clin, Off Divers & Inclus, Rochester, MN 55905 USA
[4] Mayo Clin, Div Hlth Care Policy & Res, Dept Hlth Sci Res, Rochester, MN 55905 USA
Source
CIRCULATION-ARRHYTHMIA AND ELECTROPHYSIOLOGY
Funding
US National Institutes of Health;
Keywords
artificial intelligence; electrocardiography; humans; machine learning; United States; DYSFUNCTION; SEX;
DOI
10.1161/CIRCEP.119.007988
CLC number
R5 [Internal Medicine];
Discipline classification code
1002 ; 100201 ;
Abstract
Background: Deep learning algorithms derived in homogeneous populations may be poorly generalizable and have the potential to reflect, perpetuate, and even exacerbate racial/ethnic disparities in health and health care. In this study, we aimed (1) to assess whether the performance of a deep learning algorithm designed to detect low left ventricular ejection fraction using the 12-lead ECG varies by race/ethnicity and (2) to determine whether its performance depends on the derivation population or on racial variation in the ECG.
Methods: We performed a retrospective cohort analysis of 97 829 patients with paired ECGs and echocardiograms. We tested model performance by race/ethnicity for a convolutional neural network designed to identify patients with a left ventricular ejection fraction ≤35% from the 12-lead ECG.
Results: The convolutional neural network, previously derived in a homogeneous population (derivation cohort, n=44 959; 96.2% non-Hispanic white), demonstrated consistent performance in detecting low left ventricular ejection fraction across racial/ethnic subgroups in a separate testing cohort (n=52 870): non-Hispanic white (n=44 524; area under the curve [AUC], 0.931), Asian (n=557; AUC, 0.961), black/African American (n=651; AUC, 0.937), Hispanic/Latino (n=331; AUC, 0.937), and American Indian/Native Alaskan (n=223; AUC, 0.938). In secondary analyses, a separate neural network was able to discern racial subgroup category (black/African American [AUC, 0.84] and non-Hispanic white [AUC, 0.76] in a 5-class classifier), and a network trained only in non-Hispanic whites from the original derivation cohort performed similarly well in the testing cohort, with an AUC of at least 0.930 in every racial/ethnic subgroup.
Conclusions: Our study demonstrates that, although ECG characteristics vary by race, this variation did not impair the ability of a convolutional neural network to predict low left ventricular ejection fraction from the ECG. We recommend reporting performance across diverse racial, ethnic, age, and sex groups for all new artificial intelligence tools to ensure the responsible use of artificial intelligence in medicine.
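The subgroup evaluation described above amounts to stratifying the testing cohort by self-reported race/ethnicity and computing the area under the ROC curve (AUC) of the low-ejection-fraction classifier within each stratum. The sketch below illustrates that pattern in Python; the file name (ecg_test_cohort.csv) and column names (race_ethnicity, low_ef_label for LVEF ≤35% by echocardiography, model_score for the network's output probability) are hypothetical placeholders, not code or data from the study.

    # Per-subgroup AUC for a binary low-EF classifier, stratified by
    # self-reported race/ethnicity. All names below are illustrative.
    import pandas as pd
    from sklearn.metrics import roc_auc_score

    df = pd.read_csv("ecg_test_cohort.csv")  # hypothetical: one row per patient

    results = {}
    for group, rows in df.groupby("race_ethnicity"):
        # AUC is undefined when a stratum contains only one outcome class.
        if rows["low_ef_label"].nunique() < 2:
            continue
        auc = roc_auc_score(rows["low_ef_label"], rows["model_score"])
        results[group] = {"n": len(rows), "auc": round(auc, 3)}

    for group, stats in sorted(results.items()):
        print(f"{group}: n={stats['n']}, AUC={stats['auc']}")

Reporting the per-group sample size alongside each AUC, as the abstract does, matters because confidence intervals widen sharply for the smaller subgroups (e.g., n=223 for American Indian/Native Alaskan versus n=44 524 for non-Hispanic white).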
Pages: 7
Related Papers
50 records in total
  • [1] Assessing and Mitigating Bias in Artificial Intelligence: A Review
    Sinha A.
    Sapra D.
    Sinwar D.
    Singh V.
    Raghuwanshi G.
    Recent Advances in Computer Science and Communications, 2024, 17 (01) : 1 - 10
  • [2] A Call to Action on Assessing and Mitigating Bias in Artificial Intelligence Applications for Mental Health
    Timmons, Adela C.
    Duong, Jacqueline B.
    Simo Fiallo, Natalia
    Lee, Theodore
    Vo, Huong Phuc Quynh
    Ahle, Matthew W.
    Comer, Jonathan S.
    Brewer, LaPrincess C.
    Frazier, Stacy L.
    Chaspari, Theodora
    PERSPECTIVES ON PSYCHOLOGICAL SCIENCE, 2023, 18 (05) : 1062 - 1096
  • [3] Responsible Artificial Intelligence and Bias Mitigation in Deep Learning Systems
    Gavrilova, Marina L.
2023 27TH INTERNATIONAL CONFERENCE INFORMATION VISUALISATION, IV, 2023 : 329 - 333
  • [4] On Artificial Intelligence and Deep Learning Within Medical Education
    Carin, Lawrence
    ACADEMIC MEDICINE, 2020, 95 (11) : S10 - S11
  • [5] Explainable artificial intelligence (XAI) in deep learning-based medical image analysis
    van der Velden, Bas H. M.
    Kuijf, Hugo J.
    Gilhuijs, Kenneth G. A.
    Viergever, Max A.
    MEDICAL IMAGE ANALYSIS, 2022, 79
  • [6] Weighing the benefits and risks of collecting race and ethnicity data in clinical settings for medical artificial intelligence
    Fiske, Amelia
    Blacker, Sarah
    Genevieve, Lester Darryl
    Willem, Theresa
    Fritzsche, Marie-Christine
    Buyx, Alena
    Celi, Leo Anthony
    McLennan, Stuart
    LANCET DIGITAL HEALTH, 2025, 7 (03) : e286 - e294
  • [7] Race bias analysis of a deep learning-based prostate MR autocontouring model
    Alqarni, Maram
    Jones, Emma
    Ribeiro, Luis
    Hema, Verma
    Cooper, Sian
    Morris, Stephen
    Urbano, Teresa Guerrero
    King, Andrew P.
    RADIOTHERAPY AND ONCOLOGY, 2024, 194 : S3157 - S3160
  • [8] Comparison of two artificial intelligence-augmented ECG approaches: Machine learning and deep learning
    Kashou, Anthony H.
    May, Adam M.
    Noseworthy, Peter A.
    JOURNAL OF ELECTROCARDIOLOGY, 2023, 79 : 75 - 80
  • [9] Assessing deep learning: a work program for the humanities in the age of artificial intelligence
    Jan Segessenmann
    Thilo Stadelmann
    Andrew Davison
    Oliver Dürr
    AI and Ethics, 2025, 5 (1) : 1 - 32