Machine learning to identify chronic cough from administrative claims data

Cited by: 0
Authors
Bali, Vishal [1]
Turzhitsky, Vladimir [1]
Schelfhout, Jonathan [1]
Paudel, Misti [2]
Hulbert, Erin [2]
Peterson-Brandt, Jesse [2]
Hertzberg, Jeffrey [3]
Kelly, Neal R. [3]
Patel, Raja H. [3]
Affiliations
[1] Merck & Co Inc, Ctr Observat & Real World Evidence CORE, Rahway, NJ 07065 USA
[2] Optum Insight, Hlth Econ & Outcomes Res HEOR, Eden Prairie, MN USA
[3] OptumLabs, Minnetonka, MN USA
Keywords
HEART-FAILURE; DIAGNOSIS; MODELS; ADULTS;
DOI
10.1038/s41598-024-51522-9
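Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];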
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Accurate identification of patient populations is an essential component of clinical research, especially for medical conditions such as chronic cough that are inconsistently defined and diagnosed. We aimed to develop and compare machine learning models to identify chronic cough from medical and pharmacy claims data. In this retrospective observational study, we compared three machine learning algorithms based on XGBoost, logistic regression, and neural network approaches using a large claims and electronic health record database. Of the 327,423 patients who met the study criteria, 4,818 had chronic cough based on linked claims-electronic health record data. The XGBoost model showed the best performance, achieving an area under the receiver operating characteristic curve (ROC-AUC) of 0.916. We selected a cutoff that favors a high positive predictive value (PPV) to minimize false positives, resulting in a sensitivity, specificity, PPV, and negative predictive value of 18.0%, 99.6%, 38.7%, and 98.8%, respectively, on the held-out testing set (n = 82,262). Logistic regression and neural network models achieved slightly lower ROC-AUCs of 0.907 and 0.838, respectively. The XGBoost and logistic regression models maintained their robust performance in subgroups of individuals with higher rates of chronic cough. Machine learning algorithms are one way of identifying conditions that are not coded in medical records, and they can help identify individuals with chronic cough from claims data with a high degree of classification value.
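The abstract reports sensitivity, specificity, PPV, and negative predictive value at a cutoff chosen to favor PPV. As a minimal sketch of how those four quantities follow from a score cutoff, the snippet below counts the confusion-matrix cells directly. The function name `classification_metrics` and the toy labels and scores are hypothetical and purely illustrative; they are not the study's data, code, or cutoff.

```python
def classification_metrics(y_true, y_score, cutoff):
    """Return sensitivity, specificity, PPV, and NPV at a given score cutoff."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        predicted_positive = score >= cutoff
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 0:
            tn += 1
        else:
            fn += 1

    def ratio(numerator, denominator):
        # Guard against empty cells (e.g. no predicted positives).
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # true positives among actual positives
        "specificity": ratio(tn, tn + fp),  # true negatives among actual negatives
        "ppv": ratio(tp, tp + fp),          # true positives among predicted positives
        "npv": ratio(tn, tn + fn),          # true negatives among predicted negatives
    }


# Hypothetical toy data: 3 positives, 7 negatives (not from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_score = [0.95, 0.60, 0.30, 0.70, 0.40, 0.20, 0.10, 0.10, 0.05, 0.05]

low = classification_metrics(y_true, y_score, cutoff=0.5)
high = classification_metrics(y_true, y_score, cutoff=0.9)
print(low)
print(high)
```

On this toy data, raising the cutoff from 0.5 to 0.9 removes the one false positive, so PPV rises while sensitivity falls. This is the same kind of trade-off the abstract describes when it accepts low sensitivity in exchange for fewer false positives.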
Pages: 10
Related papers
50 in total
  • [1] Machine learning to identify chronic cough from administrative claims data
    Vishal Bali
    Vladimir Turzhitsky
    Jonathan Schelfhout
    Misti Paudel
    Erin Hulbert
    Jesse Peterson-Brandt
    Jeffrey Hertzberg
    Neal R. Kelly
    Raja H. Patel
    [J]. Scientific Reports, 14
  • [2] An algorithm to identify patients with refractory chronic cough from administrative claims data: a feasibility study
    van Boemmel-Wegmann, S.
    Pires, P. Vieira
    Herrera, R.
    Vora, P.
    Hajizadeh, N.
    [J]. EUROPEAN RESPIRATORY JOURNAL, 2022, 60
  • [3] Development and validation of a machine learning algorithm to identify anaphylaxis in US administrative claims data
    Beachler, Daniel C.
    Taylor, Devon H.
    Anthony, Mary S.
    Yin, Ruihua
    Li, Ling
    Saltus, Catherine W.
    Li, Lin
    Shaunik, Alka
    Walsh, Kathleen E.
    Lanes, Stephan
    Rothman, Kenneth J.
    Johannes, Catherine
    Aroda, Vanita
    Carr, Warner
    Goldberg, Pinkus
    Accardi, Andrew
    O'Shura, J. Shane
    Sharma, Kristen
    Juhaeri, Juhaeri
    Wu, Chuntao
    [J]. PHARMACOEPIDEMIOLOGY AND DRUG SAFETY, 2020, 29 : 602 - 602
  • [4] A predictive model to identify Parkinson disease from administrative claims data
    Nielsen, Susan Searles
    Warden, Mark N.
    Camacho-Soto, Alejandra
    Willis, Allison W.
    Wright, Brenton A.
    Racette, Brad A.
    [J]. NEUROLOGY, 2017, 89 (14) : 1448 - 1456
  • [5] Using Machine Learning to Develop a Prediction Model to Identify Patients with Hemophilia A in an Administrative Claims Database
    Lyons, Jennifer L.
    Desai, Vibha
    Jemison, Jamileh
    Xu, Yaping
    Ridgeway, Greg
    Finkle, William
    Solari, Paul G.
    Sullivan, Sean D.
    Lanes, Stephan
    [J]. PHARMACOEPIDEMIOLOGY AND DRUG SAFETY, 2017, 26 : 210 - 211
  • [6] An algorithm to identify preterm infants in administrative claims data
    Eworuke, Efe
    Hampp, Christian
    Saidi, Arwa
    Winterstein, Almut G.
    [J]. PHARMACOEPIDEMIOLOGY AND DRUG SAFETY, 2012, 21 (06) : 640 - 650
  • [7] Development of a Claims-Based Algorithm to Identify Patients with Chronic Cough
    Bali, V.
    Weaver, J.
    Turzhitsky, V.
    Schelfhout, J.
    Paudel, M.
    Hulbert, E.
    Peterson-Brandt, J.
    Hertzberg, J.
    Kelly, N. R.
    Patel, R. H.
    [J]. AMERICAN JOURNAL OF RESPIRATORY AND CRITICAL CARE MEDICINE, 2021, 203 (09)
  • [8] An algorithm to identify gabapentin misuse and/or abuse in administrative claims data
    Zhao, Danni
    Nunes, Anthony P.
    Baek, Jonggyu
    Lapane, Kate L.
    [J]. DRUG AND ALCOHOL DEPENDENCE, 2022, 235
  • [9] A method for identifying patients with chronic angina from administrative claims data
    Watson, JB
    Lee, DW
    Kadlubek, PJ
    Haberman, M
    Goldberg, GA
    [J]. VALUE IN HEALTH, 2004, 7 (06) : 707 - 707
  • [10] Reader response: A predictive model to identify Parkinson disease from administrative claims data
    Kawada, Tomoyuki
    [J]. NEUROLOGY, 2018, 91 (02) : 104 - 104