Supplementing Claims Data with Electronic Medical Records to Improve Estimation and Classification of Rheumatoid Arthritis Disease Activity: A Machine Learning Approach

Cited by: 8
Authors
Feldman, Candace H. [1 ]
Yoshida, Kazuki [1 ]
Xu, Chang [1 ]
Frits, Michelle L. [1 ]
Shadick, Nancy A. [1 ]
Weinblatt, Michael E. [1 ]
Connolly, Sean E. [2 ]
Alemao, Evo [2 ]
Solomon, Daniel H. [1 ]
Affiliations
[1] Brigham & Womens Hosp, Boston, MA 02115 USA
[2] Bristol Myers Squibb, Princeton, NJ USA
Keywords
REGRESSION SHRINKAGE; RECOMMENDATIONS; VALIDATION; SELECTION; LASSO;
DOI
10.1002/acr2.11068
Chinese Library Classification number
R5 [Internal Medicine];
Discipline classification code
1002 ; 100201 ;
Abstract
Objective: Previous attempts to estimate rheumatoid arthritis (RA) disease activity using claims data only did not yield high performance. We aimed to assess whether supplementing claims data with readily available electronic medical record (EMR) data might result in improvement.
Methods: We used a subset of the Brigham and Women's Hospital Rheumatoid Arthritis Sequential Study (BRASS) that had linked Medicare claims. The disease activity score in 28 joints with C-reactive protein (DAS28-CRP) was considered the gold standard measure. Variables in the linked Medicare claims, as well as EMR data recorded in the preceding one-year period, were used as potential explanatory variables. We constructed three models: "Claims-Only," "Claims + Medications," and "Claims + Medications + Labs" (laboratory data from EMR). We selected variables via adaptive LASSO. Model performance was measured with adjusted R² for continuous DAS28-CRP and C-statistics for binary category classification (high/moderate vs low disease activity/remission).
Results: We identified 300 patients with laboratory data and linked Medicare claims. The mean age was 68 years and 80% were female. The mean (SD) DAS28-CRP was 3.6 (1.6) and 51% had high or moderate DAS28-CRP. For the continuous estimation, the adjusted R² was 0.02 for Claims-Only, 0.09 for Claims + Medications, and 0.18 for Claims + Medications + Labs. The C-statistics for discriminating the binary categories were 0.61 for Claims-Only, 0.68 for Claims + Medications, and 0.76 for Claims + Medications + Labs.
Conclusion: Adding EMR-derived variables to claims-derived variables resulted in modest improvement. Even with EMR variables, we were unable to estimate continuous DAS28-CRP satisfactorily. However, in claims-EMR models, we were able to discriminate between binary categories of disease activity with reasonable accuracy.
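The two performance measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names are hypothetical, the toy values are invented for illustration only, and the 3.2 cutoff separating high/moderate from low disease activity/remission is the conventional DAS28 threshold, assumed here because the abstract does not state the cutoff used.

```python
from itertools import product


def adjusted_r2(y_true, y_pred, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


def c_statistic(labels, scores):
    """Fraction of (positive, negative) pairs in which the positive case
    has the higher score; tied scores count as half a concordant pair."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    concordant = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p, q in product(pos, neg)
    )
    return concordant / (len(pos) * len(neg))


def is_high_or_moderate(das28_crp, cutoff=3.2):
    """Binary category; the 3.2 cutoff is an assumption (see lead-in)."""
    return 1 if das28_crp > cutoff else 0


# Toy, made-up values purely to exercise the functions.
observed = [2.1, 4.5, 3.0, 5.2, 3.6, 2.8]
predicted = [2.5, 4.0, 3.3, 4.8, 3.1, 2.9]
labels = [is_high_or_moderate(v) for v in observed]

print(adjusted_r2(observed, predicted, n_predictors=1))
print(c_statistic(labels, predicted))
```

The pairwise definition of the C-statistic is equivalent to the area under the ROC curve for a binary outcome, so a library routine could be substituted for `c_statistic` in practice.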
Pages: 552-559
Number of pages: 8
Related papers
50 in total
  • [41] Phenotyping people with a history of injecting drug use within electronic medical records using an interactive machine learning approach
    El-Hayek, Carol
    Nguyen, Thi
    Hellard, Margaret E.
    Curtis, Michael
    Sacks-Davis, Rachel
    Aung, Htein Linn
    Asselin, Jason
    Boyle, Douglas I. R.
    Wilkinson, Anna
    Polkinghorne, Victoria
    Hocking, Jane S.
    Dunn, Adam G.
    npj Digital Medicine, 2024, 7 (01)
  • [42] Joint modeling strategy for using electronic medical records data to build machine learning models: an example of intracerebral hemorrhage
    Tang, Jianxiang
    Wang, Xiaoyu
    Wan, Hongli
    Lin, Chunying
    Shao, Zilun
    Chang, Yang
    Wang, Hexuan
    Wu, Yi
    Zhang, Tao
    Du, Yu
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2022, 22 (01)
  • [43] Joint modeling strategy for using electronic medical records data to build machine learning models: an example of intracerebral hemorrhage
    Jianxiang Tang
    Xiaoyu Wang
    Hongli Wan
    Chunying Lin
    Zilun Shao
    Yang Chang
    Hexuan Wang
    Yi Wu
    Tao Zhang
    Yu Du
    BMC Medical Informatics and Decision Making, 22
  • [44] Clinical predictors of response to methotrexate in patients with rheumatoid arthritis: a machine learning approach using clinical trial data
    Stephanie Q. Duong
    Cynthia S. Crowson
    Arjun Athreya
    Elizabeth J. Atkinson
    John M. Davis
    Kenneth J. Warrington
    Eric L. Matteson
    Richard Weinshilboum
    Liewei Wang
    Elena Myasoedova
    Arthritis Research & Therapy, 24
  • [45] Clinical predictors of response to methotrexate in patients with rheumatoid arthritis: a machine learning approach using clinical trial data
    Duong, Stephanie Q.
    Crowson, Cynthia S.
    Athreya, Arjun
    Atkinson, Elizabeth J.
    Davis, John M.
    Warrington, Kenneth J.
    Matteson, Eric L.
    Weinshilboum, Richard
    Wang, Liewei
    Myasoedova, Elena
    ARTHRITIS RESEARCH & THERAPY, 2022, 24 (01)
  • [46] Ordinal labels in machine learning: a user-centered approach to improve data validity in medical settings
    Andrea Seveso
    Andrea Campagner
    Davide Ciucci
    Federico Cabitza
    BMC Medical Informatics and Decision Making, 20
  • [47] Ordinal labels in machine learning: a user-centered approach to improve data validity in medical settings
    Seveso, Andrea
    Campagner, Andrea
    Ciucci, Davide
    Cabitza, Federico
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2020, 20 (Suppl 5)
  • [48] Prediction of 3-year risk of diabetic kidney disease using machine learning based on electronic medical records
    Zheyi Dong
    Qian Wang
    Yujing Ke
    Weiguang Zhang
    Quan Hong
    Chao Liu
    Xiaomin Liu
    Jian Yang
    Yue Xi
    Jinlong Shi
    Li Zhang
    Ying Zheng
    Qiang Lv
    Yong Wang
    Jie Wu
    Xuefeng Sun
    Guangyan Cai
    Shen Qiao
    Chengliang Yin
    Shibin Su
    Xiangmei Chen
    Journal of Translational Medicine, 20
  • [49] Prediction of 3-year risk of diabetic kidney disease using machine learning based on electronic medical records
    Dong, Zheyi
    Wang, Qian
    Ke, Yujing
    Zhang, Weiguang
    Hong, Quan
    Liu, Chao
    Liu, Xiaomin
    Yang, Jian
    Xi, Yue
    Shi, Jinlong
    Zhang, Li
    Zheng, Ying
    Lv, Qiang
    Wang, Yong
    Wu, Jie
    Sun, Xuefeng
    Cai, Guangyan
    Qiao, Shen
    Yin, Chengliang
    Su, Shibin
    Chen, Xiangmei
    JOURNAL OF TRANSLATIONAL MEDICINE, 2022, 20 (01)
  • [50] Machine Learning Analysis for Data Incompleteness (MADI): Analyzing the Data Completeness of Patient Records Using a Random Variable Approach to Predict the Incompleteness of Electronic Health Records
    Gurupur, Varadraj P.
    Shelleh, Muhammed
    IEEE ACCESS, 2021, 9 : 95994 - 96001