Impact of selected pre-processing techniques on prediction of risk of early readmission for diabetic patients in India

Cited by: 9
Authors
Duggal, Reena [1 ]
Shukla, Suren [2 ]
Chandra, Sarika [3 ]
Shukla, Balvinder [4 ]
Khatri, Sunil Kumar [1 ]
Affiliations
[1] Amity Univ Uttar Pradesh, Amity Inst Informat Technol, Noida, India
[2] OHUM Healthcare Solut Private Ltd, Noida, India
[3] Kailash Hosp, Noida, India
[4] Amity Univ Uttar Pradesh, Noida, India
Keywords
Data mining; Diabetes; Feature selection; Missing value imputation; Predicting readmission rates; Pre-processing; VALIDATION;
DOI
10.1007/s13410-016-0495-4
Chinese Library Classification number
R5 [Internal Medicine];
Discipline classification code
1002 ; 100201 ;
Abstract
Diabetes is associated with an increased risk of hospital readmission. Predicting the risk of readmission for diabetic patients can help in implementing appropriate plans to prevent these readmissions. However, real-world medical data is noisy, inconsistent, and incomplete, so before building a prediction model it is essential to pre-process the data efficiently and make it suitable for predictive modelling. The objective of this study is to assess the impact of selected pre-processing techniques on the prediction of the risk of 30-day readmission among patients with diabetes in India. De-identified electronic medical records from a reputed hospital in the National Capital Region of India were used, covering diabetes patients ≥18 years old discharged from hospital between 2012 and 2015 (n = 9381). This paper focused on data pre-processing steps to improve readmission prediction outcomes. The impact of different pre-processing choices, including feature selection, missing value imputation, and data balancing, on the performance of logistic regression, Naïve Bayes, and decision tree classifiers was assessed using metrics such as area under the curve (AUC), precision, recall, and accuracy. This comprehensive experimental study, the first conducted from an Indian healthcare perspective, offered empirical evidence that most of the proposed models with pre-processing techniques significantly outperform the baseline methods (without any pre-processing) with respect to the selected evaluation criteria. AUC increased markedly with the use of an oversampling technique, as the data is skewed on the class label Readmission. Recall gained the most, with its range increasing from 0.02-0.23 to 0.78-0.85, and AUC also increased from a range of 0.56-0.68 to 0.83-0.86 with the pre-processing approach.
Data pre-processing has a significant effect on the accuracy of hospital readmission prediction for patients with diabetes, with certain schemes proving inferior to competing approaches. In addition, the impact of pre-processing schemes varies by technique, which suggests formulating different best practices to get better results from a specific technique.
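As a rough illustration of two of the pre-processing steps named in the abstract (missing value imputation and data balancing), together with the precision and recall metrics, the following minimal Python sketch may help. The abstract does not say which imputation or oversampling methods were used, so mean imputation and random oversampling are assumptions here, and the field names `hba1c` and `readmitted` and the toy records are hypothetical and not taken from the study data.

```python
import random
from collections import Counter


def impute_mean(rows, col):
    """Return copies of rows with None in a numeric column replaced by the observed mean."""
    observed = [r[col] for r in rows if r[col] is not None]
    mean = sum(observed) / len(observed)
    return [dict(r, **{col: mean if r[col] is None else r[col]}) for r in rows]


def random_oversample(rows, label, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every class matches the largest."""
    rng = random.Random(seed)
    by_class = {}
    for r in rows:
        by_class.setdefault(r[label], []).append(r)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    rng.shuffle(balanced)
    return balanced


def precision_recall(y_true, y_pred, positive=1):
    """Compute precision and recall for the positive class (0.0 when undefined)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy records: one missing lab value, one readmission out of four.
records = [
    {"hba1c": 7.0, "readmitted": 0},
    {"hba1c": None, "readmitted": 0},
    {"hba1c": 9.0, "readmitted": 0},
    {"hba1c": 8.0, "readmitted": 1},
]

filled = impute_mean(records, "hba1c")
balanced = random_oversample(filled, "readmitted")
class_counts = Counter(r["readmitted"] for r in balanced)
print(class_counts)
print(precision_recall([1, 1, 0, 0], [1, 0, 0, 1]))
```

In practice, oversampling is usually applied only to the training split so that duplicated records do not leak into the evaluation data. A classifier trained on a skewed label can reach high accuracy while almost never predicting the minority class, which is consistent with the very low baseline recall reported in the abstract.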
Pages: 469-476
Number of pages: 8