Contextualized race and ethnicity annotations for clinical text from MIMIC-III

Authors
Oliver J. Bear Don’t Walk [1]
Adrienne Pichon [2]
Harry Reyes Nieva [2]
Tony Sun [3]
Jaan Li [2]
Josh Joseph [4]
Sivan Kinberg [5]
Lauren R. Richter [3]
Salvatore Crusco [6]
Kyle Kulas [2]
Shaan A. Ahmed [2]
Daniel Snyder [2]
Ashkon Rahbari [7]
Benjamin L. Ranard [2]
Pallavi Juneja [2]
Dina Demner-Fushman [2]
Noémie Elhadad [2]
Affiliations
[1] University of Washington
[2] Columbia University Irving Medical Center
[3] Harvard Medical School
[4] One Fact Foundation
[5] University of Tartu
[6] Brigham and Women’s Hospital
[7] NewYork-Presbyterian Hospital
[8] US National Library of Medicine
DOI
10.1038/s41597-024-04183-2
Abstract
Observational health research often relies on accurate and complete race and ethnicity (RE) patient information for tasks such as characterizing cohorts, assessing quality and performance metrics of hospitals and health systems, and identifying health disparities. While the electronic health record contains structured data, including accessible patient-level RE data, these data are often missing, inaccurate, or lacking in granular detail. Natural language processing models can be trained to identify RE in clinical text, which can supplement missing RE data in clinical data repositories. Here we describe the Contextualized Race and Ethnicity Annotations for Clinical Text (C-REACT) Dataset, which comprises 12,000 patients and 17,281 sentences from their clinical notes in the MIMIC-III dataset. Two sets of reference standard annotations for RE data, built on these sentences, are made available along with annotation guidelines. The first set of annotations comprises highly granular information related to RE, such as preferred language and country of origin, while the second set contains RE labels annotated by physicians. This dataset can support health systems’ ability to use RE data to serve health equity goals.