Supplementing Claims Data with Electronic Medical Records to Improve Estimation and Classification of Rheumatoid Arthritis Disease Activity: A Machine Learning Approach

Cited by: 8
Authors
Feldman, Candace H. [1]
Yoshida, Kazuki [1]
Xu, Chang [1]
Frits, Michelle L. [1]
Shadick, Nancy A. [1]
Weinblatt, Michael E. [1]
Connolly, Sean E. [2]
Alemao, Evo [2]
Solomon, Daniel H. [1]
Affiliations
[1] Brigham & Womens Hosp, Boston, MA 02115 USA
[2] Bristol Myers Squibb, Princeton, NJ USA
Keywords
REGRESSION SHRINKAGE; RECOMMENDATIONS; VALIDATION; SELECTION; LASSO;
DOI
10.1002/acr2.11068
Chinese Library Classification number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Objective: Previous attempts to estimate rheumatoid arthritis (RA) disease activity using claims data alone did not yield high performance. We aimed to assess whether supplementing claims data with readily available electronic medical record (EMR) data might improve performance.
Methods: We used a subset of the Brigham and Women's Hospital Rheumatoid Arthritis Sequential Study (BRASS) that had linked Medicare claims. The disease activity score in 28 joints with C-reactive protein (DAS28-CRP) was considered the gold standard measure. Variables in the linked Medicare claims, as well as EMR data recorded in the preceding one-year period, were used as potential explanatory variables. We constructed three models: "Claims-Only," "Claims + Medications," and "Claims + Medications + Labs" (laboratory data from EMR). We selected variables via adaptive LASSO. Model performance was measured with adjusted R² for continuous DAS28-CRP and C-statistics for binary category classification (high/moderate vs low disease activity/remission).
Results: We identified 300 patients with laboratory data and linked Medicare claims. The mean age was 68 years and 80% were female. The mean (SD) DAS28-CRP was 3.6 (1.6) and 51% had high or moderate DAS28-CRP. For the continuous estimation, the adjusted R² was 0.02 for Claims-Only, 0.09 for Claims + Medications, and 0.18 for Claims + Medications + Labs. The C-statistics for discriminating the binary categories were 0.61 for Claims-Only, 0.68 for Claims + Medications, and 0.76 for Claims + Medications + Labs.
Conclusion: Adding EMR-derived variables to claims-derived variables resulted in modest improvement. Even with EMR variables, we were unable to estimate continuous DAS28-CRP satisfactorily. However, in claims-EMR models, we were able to discriminate between binary categories of disease activity with reasonable accuracy.
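The two performance measures named in the abstract, adjusted R² for the continuous score and the C-statistic for the binary categories, can be illustrated with a minimal sketch. This is not the authors' code. The function names, the toy values, and the 3.2 cutoff used to dichotomize DAS28-CRP are assumptions made for illustration (3.2 is the conventional DAS28 boundary between low and moderate activity; the abstract does not state which cutoff was used).

```python
def adjusted_r2(y_true, y_pred, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)


def c_statistic(labels, scores):
    """Share of (positive, negative) pairs in which the positive case
    has the higher score; tied scores count as half a concordant pair."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    concordant = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                concordant += 1.0
            elif p == q:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


# Assumed cutoff: DAS28-CRP > 3.2 counts as high/moderate activity.
HIGH_MODERATE_CUTOFF = 3.2

# Hypothetical toy values, not data from the study.
observed = [2.1, 4.5, 3.0, 5.2, 3.6, 2.8]   # measured DAS28-CRP
predicted = [2.9, 4.0, 3.4, 4.4, 3.1, 3.0]  # model estimates

labels = [1 if d > HIGH_MODERATE_CUTOFF else 0 for d in observed]

print("adjusted R^2:", adjusted_r2(observed, predicted, n_predictors=2))
print("C-statistic:", c_statistic(labels, predicted))
```

Because adjusted R² penalizes each added predictor, it can fall below the unadjusted value and even below zero, which is why it suits comparisons between models with different numbers of variables. A C-statistic of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two categories.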
Pages: 552-559
Page count: 8