Assessing and Mitigating Bias in Medical Artificial Intelligence: The Effects of Race and Ethnicity on a Deep Learning Model for ECG Analysis

Cited by: 129
Authors
Noseworthy, Peter A. [1, 2]
Attia, Zachi I. [1]
Brewer, LaPrincess C. [1]
Hayes, Sharonne N. [1, 3]
Yao, Xiaoxi [1, 2, 4]
Kapa, Suraj [1]
Friedman, Paul A. [1]
Lopez-Jimenez, Francisco [1]
Affiliations
[1] Mayo Clin, Dept Cardiovasc Med, 200 1st St SW, Rochester, MN 55905 USA
[2] Mayo Clin, Robert D & Patricia E Kern Ctr Sci Hlth Care Deli, Rochester, MN 55905 USA
[3] Mayo Clin, Off Divers & Inclus, Rochester, MN 55905 USA
[4] Mayo Clin, Div Hlth Care Policy & Res, Dept Hlth Sci Res, Rochester, MN 55905 USA
Funding
National Institutes of Health (NIH)
Keywords
artificial intelligence; electrocardiography; humans; machine learning; United States; dysfunction; sex
DOI
10.1161/CIRCEP.119.007988
Chinese Library Classification (CLC) number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Background: Deep learning algorithms derived in homogeneous populations may be poorly generalizable and have the potential to reflect, perpetuate, and even exacerbate racial/ethnic disparities in health and health care. In this study, we aimed to (1) assess whether the performance of a deep learning algorithm designed to detect low left ventricular ejection fraction using the 12-lead ECG varies by race/ethnicity and (2) determine whether its performance is determined by the derivation population or by racial variation in the ECG.
Methods: We performed a retrospective cohort analysis that included 97 829 patients with paired ECGs and echocardiograms. We tested the performance, by race/ethnicity, of a convolutional neural network designed to identify patients with a left ventricular ejection fraction ≤35% from the 12-lead ECG.
Results: The convolutional neural network, which was previously derived in a homogeneous population (derivation cohort, n=44 959; 96.2% non-Hispanic white), performed consistently in detecting low left ventricular ejection fraction across a range of racial/ethnic subgroups in a separate testing cohort (n=52 870): non-Hispanic white (n=44 524; area under the curve [AUC], 0.931), Asian (n=557; AUC, 0.961), black/African American (n=651; AUC, 0.937), Hispanic/Latino (n=331; AUC, 0.937), and American Indian/Native Alaskan (n=223; AUC, 0.938). In secondary analyses, a separate neural network was able to discern racial subgroup category (black/African American [AUC, 0.84] and white, non-Hispanic [AUC, 0.76] in a 5-class classifier). A network trained only in non-Hispanic whites from the original derivation cohort performed similarly well in the testing cohort, with an AUC of at least 0.930 in all racial/ethnic subgroups.
Conclusions: Our study demonstrates that while ECG characteristics vary by race, this variation did not affect the ability of a convolutional neural network to predict low left ventricular ejection fraction from the ECG. We recommend reporting performance among diverse ethnic, racial, age, and sex groups for all new artificial intelligence tools to ensure responsible use of artificial intelligence in medicine.
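The subgroup reporting recommended in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `roc_auc` and `auc_by_subgroup`, the subgroup labels "A" and "B", and all records are hypothetical, and the AUC is computed with the plain pairwise (Mann-Whitney) definition rather than whatever method the study used.

```python
from collections import defaultdict


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case outscores
    a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_by_subgroup(groups, labels, scores):
    """Return {subgroup: (n, AUC)} for every subgroup found in `groups`."""
    buckets = defaultdict(lambda: ([], []))
    for g, y, s in zip(groups, labels, scores):
        buckets[g][0].append(y)
        buckets[g][1].append(s)
    return {g: (len(ys), roc_auc(ys, ss)) for g, (ys, ss) in buckets.items()}


# Made-up records: (subgroup, low-EF label, model score). Not study data.
records = [
    ("A", 1, 0.9), ("A", 0, 0.2), ("A", 1, 0.7), ("A", 0, 0.4),
    ("B", 1, 0.8), ("B", 0, 0.8), ("B", 0, 0.1), ("B", 1, 0.6),
]
groups, labels, scores = zip(*records)
results = auc_by_subgroup(groups, labels, scores)
for group, (n, auc) in sorted(results.items()):
    print(f"{group}: n={n}, AUC={auc:.3f}")
```

Reporting the subgroup size next to each AUC matters because estimates from small subgroups are less precise; a fuller analysis would likely add confidence intervals, which this sketch omits.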
Pages: 7