Identification of potentially undiagnosed patients with nontuberculous mycobacteria lung disease using machine learning applied to primary care data in the UK

Cited by: 22
Authors
Doyle, Orla M. [1]
van der Laan, Roald [2]
Obradovic, Marko [2]
McMahon, Peter [3]
Daniels, Flora [4]
Pitcher, Ashley [5]
Loebinger, Michael R. [6, 7]
Affiliations
[1] IQVIA, Real World Analyt Solut, Predict Analyt, London, England
[2] Insmed Utrecht, Utrecht, Netherlands
[3] IQVIA, Real World Insights, London, England
[4] IQVIA, Real World Insights, Basel, Switzerland
[5] IQVIA, Real World Insights, Copenhagen, Denmark
[6] Royal Brompton & Harefield NHS Fdn Trust, London, England
[7] Imperial Coll London, London, England
Keywords
PULMONARY-DISEASE; FUNCTION DECLINE; MORTALITY; INFECTIONS; GERMANY; RISK; COPD;
DOI
10.1183/13993003.00045-2020
Chinese Library Classification (CLC) number
R56 [Diseases of the respiratory system and chest];
Abstract
Nontuberculous mycobacterial lung disease (NTMLD) is a rare lung disease that is often missed because of a low index of suspicion and a nonspecific clinical presentation. This retrospective study was designed to characterise the prediagnosis features of NTMLD patients in primary care and to assess the feasibility of using machine learning to identify undiagnosed NTMLD patients. The data source was IQVIA Medical Research Data (incorporating THIN, a Cegedim Database), a UK primary care electronic medical records database. NTMLD patients were identified between 2003 and 2017 by a diagnosis in primary or secondary care or by a record of an NTMLD treatment regimen. Risk factors and treatments were extracted from the prediagnosis period, guided by the literature and expert clinical opinion. The control population was enriched so that each control had at least one of these features. In total, 741 NTMLD patients and 112 784 control patients were selected. The annual prevalence of NTMLD increased from 2.7 to 5.1 per 100 000 between 2006 and 2016. The most common pre-existing diagnoses among NTMLD patients were COPD and asthma, and the most common treatments were penicillin, macrolides and inhaled corticosteroids. Compared with random testing, machine learning improved the detection of patients with NTMLD almost a thousand-fold, with an AUC of 0.94. The total prevalence of diagnosed and undiagnosed NTMLD in 2016 was estimated at between 9 and 16 per 100 000. This study supports the feasibility of applying machine learning to primary care data to screen for undiagnosed NTMLD patients, and the results indicate that there may be a substantial number of undiagnosed cases of NTMLD in the UK.
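The prevalence figures in the abstract can be combined with simple arithmetic to see what "a substantial number of undiagnosed cases" might mean. The sketch below is not the authors' method. It assumes that the 2016 diagnosed prevalence (5.1 per 100 000) and the estimated total prevalence (9 to 16 per 100 000) refer to the same population and can be subtracted directly. The function names are hypothetical and chosen for illustration only.

```python
# Figures quoted in the abstract; rates are per 100 000 population.
DIAGNOSED_2016 = 5.1             # annual prevalence of diagnosed NTMLD, 2016
TOTAL_2016_RANGE = (9.0, 16.0)   # estimated diagnosed + undiagnosed, 2016
N_CASES, N_CONTROLS = 741, 112_784


def undiagnosed_range(diagnosed, total_range):
    """Return (low, high) undiagnosed prevalence implied by the two figures.

    Assumption: both figures describe the same population and year.
    """
    low, high = total_range
    return (low - diagnosed, high - diagnosed)


def diagnosed_share(diagnosed, total_range):
    """Return (min, max) fraction of all cases that are already diagnosed."""
    low, high = total_range
    return (diagnosed / high, diagnosed / low)


def case_fraction(n_cases, n_controls):
    """Fraction of NTMLD patients in the enriched study cohort.

    This is not a population prevalence, because the controls were
    selected to have at least one risk feature.
    """
    return n_cases / (n_cases + n_controls)


if __name__ == "__main__":
    und_low, und_high = undiagnosed_range(DIAGNOSED_2016, TOTAL_2016_RANGE)
    share_min, share_max = diagnosed_share(DIAGNOSED_2016, TOTAL_2016_RANGE)
    print(f"Implied undiagnosed: {und_low:.1f} to {und_high:.1f} per 100 000")
    print(f"Diagnosed share of all cases: {share_min:.0%} to {share_max:.0%}")
    print(f"Cases in study cohort: {case_fraction(N_CASES, N_CONTROLS):.2%}")
```

Under these assumptions, the abstract's figures would imply roughly 3.9 to 10.9 undiagnosed cases per 100 000, so that only about a third to somewhat over half of all cases were diagnosed in 2016.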
Pages: 11
Related papers
50 in total
  • [21] Selecting Lung Cancer Patients from UK Primary Care Data: A Longitudinal Study of Feature Trends
    Alzubaidi, Abeer
    Kaur, Jaspreet
    Mahmud, Mufti
    Brown, David J.
    He, Jun
    Ball, Graham
    Baldwin, David R.
    O'Dowd, Emma
    Hubbard, Richard B.
    APPLIED INTELLIGENCE AND INFORMATICS, AII 2021, 2021, 1435 : 43 - 59
  • [22] Identification of major cardiovascular events in patients with diabetes using primary care data
    Pouwels, Koen Bernardus
    Voorham, Jaco
    Hak, Eelko
    Denig, Petra
    BMC HEALTH SERVICES RESEARCH, 2016, 16
  • [23] Identification of major cardiovascular events in patients with diabetes using primary care data
    Koen Bernardus Pouwels
    Jaco Voorham
    Eelko Hak
    Petra Denig
    BMC Health Services Research, 16
  • [24] Almond cultivar identification using machine learning classifiers applied to UAV-based multispectral data
    Guimaraes, Nathalie
    Padua, Luis
    Sousa, Joaquim J.
    Bento, Albino
    Couto, Pedro
    INTERNATIONAL JOURNAL OF REMOTE SENSING, 2023, 44 (05) : 1533 - 1555
  • [25] Using Machine Learning to Refer Patients with Chronic Kidney Disease to Secondary Care
    Au-Yeung, Lee
    Xie, Xianghua
    Chess, James
    Scale, Timothy
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 10219 - 10226
  • [26] Prevalence and clinical characteristics of patients with rheumatoid arthritis with interstitial lung disease using unstructured healthcare data and machine learning
    Roman Ivorra, Jose A.
    Trallero-Araguas, Ernesto
    Lopez Lasanta, Maria
    Cebrian, Laura
    Lojo, Leticia
    Lopez-Muniz, Belen
    Fernandez-Melon, Julia
    Nunez, Belen
    Silva-Fernandez, Lucia
    Veiga Cabello, Raul
    Ahijado, Pilar
    de la Morena Barrio, Isabel
    Costas Torrijo, Nerea
    Safont, Belen
    Ornilla, Enrique
    Restrepo, Juliana
    Campo, Arantxa
    Andreu, Jose L.
    Diez, Elvira
    Lopez Robles, Alejandra
    Bollo, Elena
    Benavent, Diego
    Vilanova, David
    Lujan Valdes, Sara
    Castellanos-Moreira, Raul
    RMD OPEN, 2024, 10 (01):
  • [27] Predicting Progression of Type 2 Diabetes Using Primary Care Data with the Help of Machine Learning
    Ozturk, Berk
    Lawton, Tom
    Smith, Stephen
    Habli, Ibrahim
    CARING IS SHARING-EXPLOITING THE VALUE IN DATA FOR HEALTH AND INNOVATION-PROCEEDINGS OF MIE 2023, 2023, 302 : 38 - 42
  • [28] Using EHR data and machine learning approach to facilitate the identification of patients with lung cancer from a pan-cancer cohort
    Yu, Yue
    Ruddy, Kathryn Jean
    Leventakos, Konstantinos
    Liu, Bolun
    Huo, Nan
    Pachman, Deirdre R.
    Zong, Nansu
    Xiao, Guohui
    Chute, Christopher
    Pfaff, Emily
    Cheville, Andrea L.
    Jiang, Guoqian
    JOURNAL OF CLINICAL ONCOLOGY, 2023, 41 (16)
  • [29] Can we screen for pancreatic cancer? Identifying a sub-population of patients at high risk of subsequent diagnosis using machine learning techniques applied to primary care data
    Malhotra, Ananya
    Rachet, Bernard
    Bonaventure, Audrey
    Pereira, Stephen P.
    Woods, Laura M.
    PLOS ONE, 2021, 16 (06):
  • [30] Can we screen for pancreatic cancer? Identifying a sub-population of patients at high risk of subsequent diagnosis using machine learning techniques applied to primary care data
    Malhotra, A.
    Rachet, B.
    Bonaventure, A.
    Pereira, S.
    Woods, L.
    ANNALS OF ONCOLOGY, 2020, 31 : S221 - S222