Impact of selected pre-processing techniques on prediction of risk of early readmission for diabetic patients in India

Cited by: 9
Authors
Duggal, Reena [1 ]
Shukla, Suren [2 ]
Chandra, Sarika [3 ]
Shukla, Balvinder [4 ]
Khatri, Sunil Kumar [1 ]
Affiliations
[1] Amity Univ Uttar Pradesh, Amity Inst Informat Technol, Noida, India
[2] OHUM Healthcare Solut Private Ltd, Noida, India
[3] Kailash Hosp, Noida, India
[4] Amity Univ Uttar Pradesh, Noida, India
Keywords
Data mining; Diabetes; Feature selection; Missing value imputation; Predicting readmission rates; Pre-processing; VALIDATION;
DOI
10.1007/s13410-016-0495-4
Chinese Library Classification number
R5 [Internal Medicine];
Discipline classification code
1002 ; 100201 ;
Abstract
Diabetes is associated with increased risk of hospital readmission. Predicting risk of readmission of diabetic patients can facilitate implementing appropriate plans to prevent these readmissions. But the real-world medical data is noisy, inconsistent, and incomplete. So before building the prediction model, it is essential to pre-process the data efficiently and make it appropriate for predictive modelling. The objective of this study is to assess the impact of selected pre-processing techniques on the prediction of risk of 30-day readmission among patients with diabetes in India. De-identified electronic medical records data was used from a reputed hospital in the National Capital Region in India and included diabetes patients ≥18 years old discharged from hospital in 2012 to 2015 (n = 9381). This paper focused on data pre-processing steps to improve readmission prediction outcomes. The impact of different pre-processing choices including feature selection, missing value imputation and data balancing on the classifier performance of logistic regression, Naïve Bayes, and decision tree was assessed on various performance metrics such as area under curve, precision, recall, and accuracy. This comprehensive experimental study, first time done from Indian healthcare perspective, offered empirical evidence that most proposed models with pre-processing techniques significantly outperform the baseline methods (without any pre-processing) with respect to selected evaluation criteria. Area under curve (AUC) was highly increased with the use of oversampling technique as data is skewed on class label Readmission. Recall was the biggest gainer with range increasing from 0.02-0.23 to 0.78-0.85, and there was also an increase in AUC from range 0.56-0.68 to 0.83-0.86 by using pre-processing approach.
Data pre-processing has a significant effect on hospital readmission predictive accuracy for patients with diabetes, with certain schemes proving inferior to competitive approaches. In addition, it is found that the impact of pre-processing schemes varies by technique, signifying formulation of different best practices to aid better results of a specific technique.
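The abstract attributes much of the recall gain to oversampling a skewed class label. The following is a minimal sketch of that idea, not the authors' implementation: it assumes simple random oversampling (the abstract does not say which oversampling variant was used), and the function names `random_oversample` and `precision_recall` as well as the toy data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class (assumed variant, see lead-in)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Rows of this class in the original data; duplicates are drawn from here.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the positive (e.g. readmitted) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy data: 8 non-readmitted (0) and 2 readmitted (1) records.
rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2
bal_rows, bal_labels = random_oversample(rows, labels)
```

In a real experiment the oversampling step would normally be applied to the training split only, so that duplicated records do not leak into the evaluation data.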
Pages: 469-476
Number of pages: 8