Identification of potentially undiagnosed patients with nontuberculous mycobacteria lung disease using machine learning applied to primary care data in the UK

Cited by: 22
Authors
Doyle, Orla M. [1 ]
van der Laan, Roald [2 ]
Obradovic, Marko [2 ]
McMahon, Peter [3 ]
Daniels, Flora [4 ]
Pitcher, Ashley [5 ]
Loebinger, Michael R. [6 ,7 ]
Affiliations
[1] IQVIA, Real World Analyt Solut, Predict Analyt, London, England
[2] Insmed Utrecht, Utrecht, Netherlands
[3] IQVIA, Real World Insights, London, England
[4] IQVIA, Real World Insights, Basel, Switzerland
[5] IQVIA, Real World Insights, Copenhagen, Denmark
[6] Royal Brompton & Harefield NHS Fdn Trust, London, England
[7] Imperial Coll London, London, England
Keywords
PULMONARY-DISEASE; FUNCTION DECLINE; MORTALITY; INFECTIONS; GERMANY; RISK; COPD
DOI
10.1183/13993003.00045-2020
Chinese Library Classification number
R56 [Diseases of the respiratory system and chest]
Abstract
Nontuberculous mycobacterial lung disease (NTMLD) is a rare lung disease that is often missed because of a low index of suspicion and a nonspecific clinical presentation. This retrospective study was designed to characterise the prediagnosis features of NTMLD patients in primary care and to assess the feasibility of using machine learning to identify undiagnosed NTMLD patients. IQVIA Medical Research Data (incorporating THIN, a Cegedim Database), a UK electronic medical records primary care database, was used. NTMLD patients were identified between 2003 and 2017 by a diagnosis in primary or secondary care or by a record of an NTMLD treatment regimen. Risk factors and treatments were extracted from the prediagnosis period, guided by the literature and expert clinical opinion. The control population was enriched to have at least one of these features. 741 NTMLD and 112 784 control patients were selected. Annual prevalence rates of NTMLD increased from 2.7 to 5.1 per 100 000 between 2006 and 2016. The most common pre-existing diagnoses among NTMLD patients were COPD and asthma, and the most common treatments were penicillin, macrolides and inhaled corticosteroids. Compared to random testing, machine learning improved detection of patients with NTMLD by almost a thousand-fold, with an AUC of 0.94. The total prevalence of diagnosed and undiagnosed cases of NTMLD in 2016 was estimated to range between 9 and 16 per 100 000. This study supports the feasibility of applying machine learning to primary care data to screen for undiagnosed NTMLD patients, with results indicating that there may be a substantial number of undiagnosed cases of NTMLD in the UK.
Pages: 11
Related papers
50 in total
  • [31] USING MACHINE LEARNING TO DETECT PATIENTS WITH UNDIAGNOSED RARE DISEASES: AN APPLICATION OF SUPPORT VECTOR MACHINES TO A RARE ONCOLOGY DISEASE
    Rigg, J.
    Lodhi, H.
    Nasuti, P.
    VALUE IN HEALTH, 2015, 18 (07) : A705 - A705
  • [32] Machine learning techniques applied to forecast Black Sigatoka disease development rate using meteorological data
    Calvo-Valverde, Luis-Alexander
    Guzman Quesada, Mauricio
    Guzman Alvarez, Jose-Antonio
    Alvarado-Moya, Pablo
    PHYTOPATHOLOGY, 2017, 107 (07) : 16 - 16
  • [33] Identification of disease-associated loci using machine learning for genotype and network data integration
    Leal, Luis G.
    David, Alessia
    Jarvelin, Marjo-Riita
    Sebert, Sylvain
    Mannikko, Minna
    Karhunen, Ville
    Seaby, Eleanor
    Hoggart, Clive
    Sternberg, Michael J. E.
    BIOINFORMATICS, 2019, 35 (24) : 5182 - 5190
  • [34] Improving triaging from primary care into secondary care using heterogeneous data-driven hybrid machine learning
    Wang, Bing
    Li, Weizi
    Bradlow, Anthony
    Bazuaye, Eghosa
    Chan, Antoni T. Y.
    DECISION SUPPORT SYSTEMS, 2023, 166
  • [35] PREDICTIVE MODELING TO IDENTIFY UNDIAGNOSED PATIENTS WITH ALPHA-1 ANTITRYPSIN DEFICIENCY USING MACHINE LEARNING AND MEDICAL CLAIMS DATA
    Colbaugh, Richard
    Glass, Kristin
    Himmelhan, Iris
    Hinson, Jimmy
    Sanchirico, Marie
    CHEST, 2022, 162 (04) : 1526A - 1527A
  • [36] Identification of Patients in Need of Advanced Care for Depression Using Data Extracted From a Statewide Health Information Exchange: A Machine Learning Approach
    Kasthurirathne, Suranga N.
    Biondich, Paul G.
    Grannis, Shaun J.
    Purkayastha, Saptarshi
    Vest, Joshua R.
    Jones, Josette F.
    JOURNAL OF MEDICAL INTERNET RESEARCH, 2019, 21 (07)
  • [37] Decision Tree Approaches to Select High Risk Patients for Lung Cancer Screening Based on the UK Primary Care Data
    Rai, Teena
    Shen, Yuan
    Kaur, Jaspreet
    He, Jun
    Mahmud, Mufti
    Brown, David J.
    Baldwin, David R.
    O'Dowd, Emma
    Hubbard, Richard
    ARTIFICIAL INTELLIGENCE IN MEDICINE, AIME 2023, 2023, 13897 : 35 - 39
  • [38] Unsupervised machine learning using 3D seismic data applied to reservoir evaluation and rock type identification
    Hussein, Marwa
    Stewart, Robert R.
    Sacrey, Deborah
    Wu, Jonny
    Athale, Rajas
    INTERPRETATION-A JOURNAL OF SUBSURFACE CHARACTERIZATION, 2021, 9 (02) : T549 - T568
  • [39] Identification of rare diseases using electronic medical records - example of allergic bronchopulmonary aspergillosis in UK Primary care data
    Maguire, Andrew
    Johnson, Michelle E.
    Denning, David W.
    Ferreira, Germano L. C.
    Cassidy, Adrian
    PHARMACOEPIDEMIOLOGY AND DRUG SAFETY, 2017, 26 : 287 - 288
  • [40] Identifying High-Need Primary Care Patients Using Nursing Knowledge and Machine Learning Methods
    Hewner, Sharon
    Smith, Erica
    Sullivan, Suzanne S.
    APPLIED CLINICAL INFORMATICS, 2023, 14 (03) : 408 - 417