Improving the Efficiency of Clinical Trial Recruitment Using an Ensemble Machine Learning to Assist With Eligibility Screening

Cited by: 9
Authors
Cai, Tianrun [1]
Cai, Fiona [2]
Dahal, Kumar P. [1]
Cremone, Gabrielle [1]
Lam, Ethan [1]
Golnik, Charlotte [1]
Seyok, Thany [1]
Hong, Chuan [3]
Cai, Tianxi [3]
Liao, Katherine P. [4, 5]
Affiliations
[1] Brigham & Womens Hosp, Boston, MA 02115 USA
[2] MIT, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[3] Harvard Univ, Boston, MA 02115 USA
[4] Harvard Univ, Brigham & Womens Hosp, Boston, MA 02115 USA
[5] Vet Affairs Boston Healthcare Syst, Boston, MA USA
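Funding
National Institutes of Health (USA)
Keywords
COST
DOI
10.1002/acr2.11289
Chinese Library Classification number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract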
Objective Efficiently identifying eligible patients is a crucial first step for a successful clinical trial. The objective of this study was to test whether an approach using electronic health record (EHR) data and an ensemble machine learning algorithm, incorporating billing codes and data from clinical notes processed by natural language processing (NLP), can improve the efficiency of eligibility screening.
Methods We studied patients screened for a clinical trial of rheumatoid arthritis (RA) who had one or more International Classification of Diseases (ICD) codes for RA and were older than 35 years, from a tertiary care center and a community hospital. The following three groups of EHR features were considered for the algorithm: 1) structured features, 2) the counts of NLP concepts from notes, and 3) health care utilization. All features were linked to dates. We compared random forest and logistic regression with a least absolute shrinkage and selection operator (LASSO) penalty against the following two standard approaches: 1) one or more RA ICD codes and no ICD codes related to exclusion criteria (Screen(RAICD1+EX)) and 2) two or more RA ICD codes (Screen(RAICD2)). To test portability, we trained the algorithm at one institution and tested it at the other.
Results In total, 3359 patients at Brigham and Women's Hospital (BWH) and 642 patients at Faulkner Hospital (FH) were studied, with 461 (13.7%) eligible patients at BWH and 84 (13.4%) at FH. The application of the algorithm reduced the number of ineligible patients requiring chart review by 40.5% at the tertiary care center and by 57.0% at the community hospital. In contrast, Screen(RAICD2) reduced the number of patients requiring chart review by 2.7% to 11.3%; Screen(RAICD1+EX) reduced the number of patients requiring chart review by 63% to 65% but excluded 22% to 27% of eligible patients.
Conclusion The ensemble machine learning algorithm incorporating billing codes and NLP data increased the efficiency of eligibility screening by reducing the number of patients requiring chart review while not excluding eligible patients. Moreover, this approach can be trained at one institution and applied at another for multicenter clinical trials.
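The two rule-based comparison screens described in the Methods are simple enough to sketch in code. The following is a minimal illustration only, not the authors' implementation: the `Patient` fields, the function names, the toy cohort, and the exact definitions of the two metrics (fraction of screened patients removed from chart review, and fraction of eligible patients wrongly excluded) are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Patient:
    # Hypothetical fields; the source does not specify a data layout.
    ra_icd_count: int         # number of RA ICD codes on record
    exclusion_icd_count: int  # number of ICD codes tied to exclusion criteria
    eligible: bool            # gold-standard eligibility from chart review


def screen_raicd2(p: Patient) -> bool:
    """Keep patients with two or more RA ICD codes."""
    return p.ra_icd_count >= 2


def screen_raicd1_ex(p: Patient) -> bool:
    """Keep patients with one or more RA ICD codes and no exclusion codes."""
    return p.ra_icd_count >= 1 and p.exclusion_icd_count == 0


def evaluate(patients: List[Patient],
             screen: Callable[[Patient], bool]) -> Dict[str, float]:
    """Summarize a screen under this example's assumed metric definitions."""
    kept = [p for p in patients if screen(p)]
    eligible = [p for p in patients if p.eligible]
    missed = [p for p in eligible if not screen(p)]
    return {
        # Share of screened patients who no longer need chart review.
        "chart_review_reduction": 1 - len(kept) / len(patients) if patients else 0.0,
        # Share of truly eligible patients the screen wrongly drops.
        "eligible_excluded": len(missed) / len(eligible) if eligible else 0.0,
    }


# Synthetic toy cohort, invented purely for illustration.
cohort = [
    Patient(3, 0, True),
    Patient(1, 0, True),
    Patient(2, 1, True),
    Patient(1, 2, False),
    Patient(2, 0, False),
    Patient(1, 0, False),
    Patient(4, 1, False),
    Patient(1, 1, False),
]

result_raicd2 = evaluate(cohort, screen_raicd2)
result_raicd1_ex = evaluate(cohort, screen_raicd1_ex)
print(result_raicd2)
print(result_raicd1_ex)
```

A learned model, such as the random forest or LASSO-penalized logistic regression named in the abstract, could be passed to the same `evaluate` function as a thresholded predicate, which would allow its trade-off between workload reduction and missed eligible patients to be compared with the rule-based screens on equal terms.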
Pages: 593-600
Page count: 8