Impact of selected pre-processing techniques on prediction of risk of early readmission for diabetic patients in India

Cited by: 9
Authors
Duggal, Reena [1 ]
Shukla, Suren [2 ]
Chandra, Sarika [3 ]
Shukla, Balvinder [4 ]
Khatri, Sunil Kumar [1 ]
Affiliations
[1] Amity Univ Uttar Pradesh, Amity Inst Informat Technol, Noida, India
[2] OHUM Healthcare Solut Private Ltd, Noida, India
[3] Kailash Hosp, Noida, India
[4] Amity Univ Uttar Pradesh, Noida, India
Keywords
Data mining; Diabetes; Feature selection; Missing value imputation; Predicting readmission rates; Pre-processing; VALIDATION;
DOI
10.1007/s13410-016-0495-4
Chinese Library Classification number
R5 [Internal medicine];
Discipline classification code
1002 ; 100201 ;
Abstract
Diabetes is associated with an increased risk of hospital readmission. Predicting the risk of readmission for diabetic patients can facilitate appropriate plans to prevent these readmissions. However, real-world medical data is noisy, inconsistent, and incomplete, so before building a prediction model it is essential to pre-process the data efficiently and make it suitable for predictive modelling. The objective of this study is to assess the impact of selected pre-processing techniques on the prediction of risk of 30-day readmission among patients with diabetes in India. De-identified electronic medical records data from a reputed hospital in the National Capital Region of India was used; it included diabetes patients ≥18 years old discharged from hospital between 2012 and 2015 (n = 9381). This paper focused on data pre-processing steps to improve readmission prediction outcomes. The impact of different pre-processing choices, including feature selection, missing value imputation, and data balancing, on the performance of logistic regression, Naïve Bayes, and decision tree classifiers was assessed using performance metrics such as area under the curve (AUC), precision, recall, and accuracy. This comprehensive experimental study, the first conducted from an Indian healthcare perspective, offered empirical evidence that most of the proposed models with pre-processing techniques significantly outperform the baseline methods (without any pre-processing) with respect to the selected evaluation criteria. AUC increased markedly with the use of an oversampling technique, as the data is skewed on the class label Readmission. Recall gained the most, with its range increasing from 0.02-0.23 to 0.78-0.85, and the AUC range also increased from 0.56-0.68 to 0.83-0.86 with the pre-processing approach.
Data pre-processing has a significant effect on the predictive accuracy of hospital readmission models for patients with diabetes, with certain schemes proving inferior to competing approaches. In addition, the impact of pre-processing schemes varies by modelling technique, which suggests that separate best practices should be formulated to get better results from each specific technique.
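As a rough illustration of why data balancing and recall matter on a skewed class label, the sketch below shows random oversampling and the precision, recall, and accuracy metrics in plain Python. It is not the authors' implementation: the function names, the toy data, and the choice of simple random duplication as the oversampling method are all assumptions made for illustration only.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def precision_recall_accuracy(y_true, y_pred, positive=1):
    """Compute precision, recall, and accuracy for a binary label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = correct / len(pairs)
    return precision, recall, accuracy


# Toy, made-up data: 9 non-readmitted patients (0) and 1 readmitted (1).
toy_rows = [[i] for i in range(10)]
toy_labels = [0] * 9 + [1]

# A model that always predicts "not readmitted" looks accurate on skewed
# data, yet it finds none of the readmissions (recall is zero).
always_negative = [0] * len(toy_labels)
print(precision_recall_accuracy(toy_labels, always_negative))

# After oversampling, both classes are equally represented for training.
balanced_rows, balanced_labels = random_oversample(toy_rows, toy_labels)
print(Counter(balanced_labels))
```

In practice, oversampling would be applied to the training split only, so that duplicated rows do not leak into the evaluation data.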
Pages: 469-476
Page count: 8
Related papers
45 in total
  • [1] Impact of selected pre-processing techniques on prediction of risk of early readmission for diabetic patients in India
    Reena Duggal
    Suren Shukla
    Sarika Chandra
    Balvinder Shukla
    Sunil Kumar Khatri
    International Journal of Diabetes in Developing Countries, 2016, 36 : 469 - 476
  • [2] Appraisal of Pre-processing Techniques for Automated Detection of Diabetic Retinopathy
    Bhardwaj, Charu
    Jain, Shruti
    Sood, Meenakshi
    2018 FIFTH INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED AND GRID COMPUTING (IEEE PDGC), 2018, : 734 - 739
  • [3] Impact of applying pre-processing techniques for improving classification accuracy
    Sharmila, T. Sree
    Ramar, K.
    Raja, T. Sree Renga
    SIGNAL IMAGE AND VIDEO PROCESSING, 2014, 8 (01) : 149 - 157
  • [4] Impact of applying pre-processing techniques for improving classification accuracy
    T. Sree Sharmila
    K. Ramar
    T. Sree Renga Raja
    Signal, Image and Video Processing, 2014, 8 : 149 - 157
  • [5] Impact of Image Pre-Processing on Radiomics Feature Prediction Power in Recurrence Glioblastoma Patients
    Hajianfar, G.
    Shiri, I.
    Oveisi, M.
    Maleki, H.
    Haghparast, A.
    MEDICAL PHYSICS, 2018, 45 (06) : E215 - E215
  • [6] Impact of Data Pre-Processing Techniques on Deep Learning Based Power Attacks
    Aljuffri, Abdullah
    Reinbrecht, Cezar
    Hamdioui, Said
    Taouil, Mottaqiallah
    2021 16TH INTERNATIONAL CONFERENCE ON DESIGN & TECHNOLOGY OF INTEGRATED SYSTEMS IN NANOSCALE ERA (DTIS 2021), 2021,
  • [7] An evaluation of various data pre-processing techniques with machine learning models for water level prediction
    Ervin Shan Khai Tiu
    Yuk Feng Huang
    Jing Lin Ng
    Nouar AlDahoul
    Ali Najah Ahmed
    Ahmed Elshafie
    Natural Hazards, 2022, 110 : 121 - 153
  • [8] An evaluation of various data pre-processing techniques with machine learning models for water level prediction
    Tiu, Ervin Shan Khai
    Huang, Yuk Feng
    Ng, Jing Lin
    AlDahoul, Nouar
    Ahmed, Ali Najah
    Elshafie, Ahmed
    NATURAL HAZARDS, 2022, 110 (01) : 121 - 153
  • [9] Efficient Dengue Spread Prediction Using Machine Learning Models with Various Pre-processing Techniques
    Saraswathi, K.
    Rohini, K.
    2024 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATION AND APPLIED INFORMATICS, ACCAI 2024, 2024,
  • [10] Predictive risk modelling for early hospital readmission of patients with diabetes in India
    Reena Duggal
    Suren Shukla
    Sarika Chandra
    Balvinder Shukla
    Sunil Kumar Khatri
    International Journal of Diabetes in Developing Countries, 2016, 36 : 519 - 528