Supplementing Claims Data with Electronic Medical Records to Improve Estimation and Classification of Rheumatoid Arthritis Disease Activity: A Machine Learning Approach

Cited by: 8
Authors
Feldman, Candace H. [1]
Yoshida, Kazuki [1]
Xu, Chang [1]
Frits, Michelle L. [1]
Shadick, Nancy A. [1]
Weinblatt, Michael E. [1]
Connolly, Sean E. [2]
Alemao, Evo [2]
Solomon, Daniel H. [1]
Affiliations
[1] Brigham & Womens Hosp, Boston, MA 02115 USA
[2] Bristol Myers Squibb, Princeton, NJ USA
Keywords
REGRESSION SHRINKAGE; RECOMMENDATIONS; VALIDATION; SELECTION; LASSO;
DOI
10.1002/acr2.11068
Chinese Library Classification (CLC) number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Objective: Previous attempts to estimate rheumatoid arthritis (RA) disease activity using claims data alone did not yield high performance. We aimed to assess whether supplementing claims data with readily available electronic medical record (EMR) data might result in improvement.

Methods: We used a subset of the Brigham and Women's Hospital Rheumatoid Arthritis Sequential Study (BRASS) that had linked Medicare claims. The disease activity score in 28 joints with C-reactive protein (DAS28-CRP) was considered the gold standard measure. Variables from the linked Medicare claims and from the EMR, recorded in the preceding one-year period, were used as potential explanatory variables. We constructed three models: "Claims-Only," "Claims + Medications," and "Claims + Medications + Labs" (laboratory data from EMR). We selected variables via adaptive LASSO. Model performance was measured with adjusted R² for continuous DAS28-CRP and C-statistics for binary category classification (high/moderate vs low disease activity/remission).

Results: We identified 300 patients with laboratory data and linked Medicare claims. The mean age was 68 years and 80% were female. The mean (SD) DAS28-CRP was 3.6 (1.6) and 51% had high or moderate DAS28-CRP. For the continuous estimation, the adjusted R² was 0.02 for Claims-Only, 0.09 for Claims + Medications, and 0.18 for Claims + Medications + Labs. The C-statistics for discriminating the binary categories were 0.61 for Claims-Only, 0.68 for Claims + Medications, and 0.76 for Claims + Medications + Labs.

Conclusion: Adding EMR-derived variables to claims-derived variables resulted in modest improvement. Even with EMR variables, we were unable to estimate continuous DAS28-CRP satisfactorily. However, in claims-EMR models, we were able to discriminate between binary categories of disease activity with reasonable accuracy.
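The abstract names three techniques: variable selection via adaptive LASSO, adjusted R² for the continuous score, and a C-statistic for the binary category. The sketch below illustrates how these pieces can fit together using scikit-learn on synthetic data. It is not the authors' code, and the abstract does not say which software they used. Several choices are assumptions made only for illustration: the ridge pilot fit used to derive penalty weights, the helper names `adaptive_lasso`, `adjusted_r2`, and `simulate`, the 3.2 cutoff separating high/moderate from low/remission, and the use of the continuous prediction as the ranking score for the C-statistic.

```python
import numpy as np
from sklearn.linear_model import LassoCV, Ridge
from sklearn.metrics import r2_score, roc_auc_score


def adaptive_lasso(X, y, gamma=1.0, seed=0):
    """Two-stage adaptive LASSO: ridge pilot fit, then a weighted L1 fit."""
    # Stage 1 (assumed pilot estimator): weights w_j = 1 / |beta_j|^gamma.
    pilot = Ridge(alpha=1.0).fit(X, y)
    weights = 1.0 / (np.abs(pilot.coef_) ** gamma + 1e-8)
    # Stage 2: dividing column j by w_j turns a plain LASSO into a weighted one.
    lasso = LassoCV(cv=5, random_state=seed).fit(X / weights, y)
    coef = lasso.coef_ / weights  # map back to the original feature scale
    return coef, lasso.intercept_


def adjusted_r2(y, y_hat, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    n = len(y)
    r2 = r2_score(y, y_hat)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


def simulate(n=300, p=20, seed=0):
    """Synthetic stand-in data; NOT the BRASS cohort."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, p))
    beta = np.zeros(p)
    beta[:3] = [0.8, -0.6, 0.4]  # only three informative predictors
    y = 3.6 + X @ beta + rng.normal(scale=1.0, size=n)
    return X, y


X, y = simulate()
coef, intercept = adaptive_lasso(X, y)
y_hat = intercept + X @ coef
k = int(np.count_nonzero(coef))  # number of variables kept by the penalty

# In-sample metrics (optimistic; a real analysis would validate out of sample).
adj_r2 = adjusted_r2(y, y_hat, k)
# Assumed cutoff: scores above 3.2 are treated as high/moderate activity.
is_active = y > 3.2
c_stat = roc_auc_score(is_active, y_hat)
print(f"selected={k}, adjusted R2={adj_r2:.2f}, C-statistic={c_stat:.2f}")
```

Because the synthetic outcome is far more predictable than the real one, the printed metrics say nothing about the study's results; the sketch only shows how the selection step and the two performance measures relate to each other.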
Pages: 552-559
Page count: 8
Related papers
50 items in total
  • [31] Models solely using claims-based administrative data are poor predictors of rheumatoid arthritis disease activity
    Brian C. Sauer
    Chia-Chen Teng
    Neil A. Accortt
    Zachary Burningham
    David Collier
    Mona Trivedi
    Grant W. Cannon
    Arthritis Research & Therapy, 19
  • [32] Models solely using claims-based administrative data are poor predictors of rheumatoid arthritis disease activity
    Sauer, Brian C.
    Teng, Chia-Chen
    Accortt, Neil A.
    Burningham, Zachary
    Collier, David
    Trivedi, Mona
    Cannon, Grant W.
    ARTHRITIS RESEARCH & THERAPY, 2017, 19
  • [33] Built-in-Electronic-Medical-Record Disease Activity Calculators and Treat-to-Target in Rheumatoid Arthritis
    Jayatilleke, Arundathi
    Pompa, Scott
    ARTHRITIS & RHEUMATOLOGY, 2018, 70
  • [34] Approach to machine learning for extraction of real-world data variables from electronic health records
    Adamson, Blythe
    Waskom, Michael
    Blarre, Auriane
    Kelly, Jonathan
    Krismer, Konstantin
    Nemeth, Sheila
    Gippetti, James
    Ritten, John
    Harrison, Katherine
    Ho, George
    Linzmayer, Robin
    Bansal, Tarun
    Wilkinson, Samuel
    Amster, Guy
    Estola, Evan
    Benedum, Corey M.
    Fidyk, Erin
    Estevez, Melissa
    Shapiro, Will
    Cohen, Aaron B.
    FRONTIERS IN PHARMACOLOGY, 2023, 14
  • [35] Using Electronic Health Records and Machine Learning to Make Medical-Related Predictions from Non-Medical Data
    Pitoglou, Stavros
    Koumpouros, Yiannis
    Anastasiou, Athanasios
    2018 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND DATA ENGINEERING (ICMLDE 2018), 2018, : 56 - 60
  • [36] Machine Learning Model to Accurately Identify Rheumatoid Arthritis Patients Using Raw Electronic Health Record Data
    Gilvaz, Vinit
    Reginato, Anthony
    Dalal, Deepan
    Crough, Brad
    ARTHRITIS & RHEUMATOLOGY, 2022, 74 : 2766 - 2767
  • [37] PREDICTION OF RESPONSE TO METHOTREXATE IN PATIENTS WITH RHEUMATOID ARTHRITIS: A MACHINE LEARNING APPROACH USING CLINICAL TRIAL DATA
    Duong, S.
    Crowson, C. S.
    Athreya, A.
    Atkinson, E.
    Davis, J. M., III
    Warrington, K. J.
    Matteson, E.
    Weinshilboum, R.
    Wang, L.
    Myasoedova, E.
    ANNALS OF THE RHEUMATIC DISEASES, 2022, 81 : 513 - 514
  • [38] On an Approach of the Solution of Machine Learning Problems Integrated with Data from the Open-Source System of Electronic Medical Records: Application for Fractures Prediction
    Martsenyuk, Vasyl
    Povoroznyuk, Vladyslav
    Semenets, Andriy
    Martynyuk, Larysa
    ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, ICAISC 2019, PT II, 2019, 11509 : 228 - 239
  • [39] A machine learning approach for stratifying risk for food allergies utilizing electronic medical record data
    Landau, Tamar
    Gamrasni, Keren
    Barlev, Yotam
    Elizur, Arnon
    Benor, Shira
    Mimouni, Francis
    Brandwein, Michael
    ALLERGY, 2024, 79 (02) : 499 - 502
  • [40] Natural language processing and machine learning to enable automatic extraction and classification of patients' smoking status from electronic medical records
    Caccamisi, Andrea
    Jorgensen, Leif
    Dalianis, Hercules
    Rosenlund, Mats
    UPSALA JOURNAL OF MEDICAL SCIENCES, 2020, 125 (04) : 316 - 324