Supplementing Claims Data with Electronic Medical Records to Improve Estimation and Classification of Rheumatoid Arthritis Disease Activity: A Machine Learning Approach

Cited by: 8
Authors
Feldman, Candace H. [1 ]
Yoshida, Kazuki [1 ]
Xu, Chang [1 ]
Frits, Michelle L. [1 ]
Shadick, Nancy A. [1 ]
Weinblatt, Michael E. [1 ]
Connolly, Sean E. [2 ]
Alemao, Evo [2 ]
Solomon, Daniel H. [1 ]
Affiliations
[1] Brigham & Womens Hosp, Boston, MA 02115 USA
[2] Bristol Myers Squibb, Princeton, NJ USA
Keywords
REGRESSION SHRINKAGE; RECOMMENDATIONS; VALIDATION; SELECTION; LASSO;
DOI
10.1002/acr2.11068
Chinese Library Classification number
R5 [Internal Medicine];
Discipline classification code
1002; 100201;
Abstract
Objective: Previous attempts to estimate rheumatoid arthritis (RA) disease activity using claims data alone did not yield high performance. We aimed to assess whether supplementing claims data with readily available electronic medical record (EMR) data might improve performance.
Methods: We used a subset of the Brigham and Women's Hospital Rheumatoid Arthritis Sequential Study (BRASS) that had linked Medicare claims. The disease activity score in 28 joints with C-reactive protein (DAS28-CRP) was considered the gold-standard measure. Variables from the linked Medicare claims and from the EMR, recorded in the preceding one-year period, were used as potential explanatory variables. We constructed three models: "Claims-Only," "Claims + Medications," and "Claims + Medications + Labs" (laboratory data from the EMR). We selected variables via adaptive LASSO. Model performance was measured with adjusted R² for continuous DAS28-CRP and with C-statistics for binary category classification (high/moderate vs low disease activity/remission).
Results: We identified 300 patients with laboratory data and linked Medicare claims. The mean age was 68 years and 80% were female. The mean (SD) DAS28-CRP was 3.6 (1.6), and 51% had high or moderate DAS28-CRP. For the continuous estimation, the adjusted R² was 0.02 for Claims-Only, 0.09 for Claims + Medications, and 0.18 for Claims + Medications + Labs. The C-statistics for discriminating the binary categories were 0.61 for Claims-Only, 0.68 for Claims + Medications, and 0.76 for Claims + Medications + Labs.
Conclusion: Adding EMR-derived variables to claims-derived variables resulted in modest improvement. Even with EMR variables, we were unable to estimate continuous DAS28-CRP satisfactorily. However, in claims-EMR models, we were able to discriminate between binary categories of disease activity with reasonable accuracy.
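The two performance measures named in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions of adjusted R² and the C-statistic (the probability that a randomly chosen high/moderate-activity patient receives a higher model score than a randomly chosen low-activity/remission patient). The function names, the toy numbers, and the 3.2 cutoff used to dichotomize DAS28-CRP are assumptions for illustration only; the abstract does not state the cutoff or the authors' implementation, and the adaptive LASSO selection step is not reproduced here.

```python
from typing import List, Sequence


def adjusted_r2(y_true: Sequence[float], y_pred: Sequence[float],
                n_predictors: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    n = len(y_true)
    if n - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors + 1")
    mean_y = sum(y_true) / n
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)   # total sum of squares
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))  # residual SS
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


def c_statistic(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    concordant = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                concordant += 1.0
            elif p == q:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


def dichotomize(das28_crp: Sequence[float], cutoff: float = 3.2) -> List[int]:
    """1 = high/moderate activity, 0 = low activity/remission.

    The 3.2 cutoff is an assumed, illustrative threshold.
    """
    return [1 if value > cutoff else 0 for value in das28_crp]


if __name__ == "__main__":
    # Toy values, not study data.
    observed = [2.0, 3.0, 4.0, 5.0]
    predicted = [2.5, 2.5, 4.5, 4.5]
    print(adjusted_r2(observed, predicted, n_predictors=1))
    print(c_statistic(dichotomize(observed), predicted))
```

Because adjusted R² penalizes each added predictor, it allows a fairer comparison between the smaller Claims-Only model and the larger models that add medication and laboratory variables.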
Pages: 552-559
Page count: 8