Contextualized race and ethnicity annotations for clinical text from MIMIC-III

Cited by: 1
Authors
Oliver J. Bear Don’t Walk [1]
Adrienne Pichon [2]
Harry Reyes Nieva [2]
Tony Sun [3]
Jaan Li [2]
Josh Joseph [4]
Sivan Kinberg [5]
Lauren R. Richter [3]
Salvatore Crusco [6]
Kyle Kulas [2]
Shaan A. Ahmed [2]
Daniel Snyder [2]
Ashkon Rahbari [7]
Benjamin L. Ranard [2]
Pallavi Juneja [2]
Dina Demner-Fushman [2]
Noémie Elhadad [2]
Affiliations
[1] University of Washington
[2] Columbia University Irving Medical Center
[3] Harvard Medical School
[4] One Fact Foundation
[5] University of Tartu
[6] Brigham and Women’s Hospital
[7] NewYork-Presbyterian Hospital
[8] US National Library of Medicine
DOI
10.1038/s41597-024-04183-2
Abstract
Observational health research often relies on accurate and complete patient race and ethnicity (RE) information for tasks such as characterizing cohorts, assessing quality and performance metrics of hospitals and health systems, and identifying health disparities. While the electronic health record contains structured, patient-level RE data, these data are often missing, inaccurate, or lacking in granular detail. Natural language processing models can be trained to identify RE in clinical text, which can supplement missing RE data in clinical data repositories. Here we describe the Contextualized Race and Ethnicity Annotations for Clinical Text (C-REACT) Dataset, which comprises 12,000 patients and 17,281 sentences from their clinical notes in the MIMIC-III dataset. Two sets of reference standard annotations for RE data, built from these sentences, are made available along with annotation guidelines. The first set of annotations comprises highly granular information related to RE, such as preferred language and country of origin, while the second set contains RE labels annotated by physicians. This dataset can support health systems’ ability to use RE data to serve health equity goals.