Efficient prediction of early-stage diabetes using XGBoost classifier with random forest feature selection technique

Cited by: 8
Authors
Gundogdu, Serdar [1]
Affiliations
[1] Dokuz Eylul Univ, Bergama Vocat Sch, Dept Comp Technol, Izmir, Turkiye
Keywords
COVID-19; Diabetes; Feature selection; MLR; Random forest; XGBoost; MACHINE; MELLITUS; MODELS; PLASMA
DOI
10.1007/s11042-023-15165-8
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Diabetes is one of the most common and serious diseases affecting human health. Early diagnosis and treatment are vital to prevent or delay diabetes-related complications. An automated diabetes detection system assists physicians in early diagnosis and reduces complications by providing fast and precise results. This study introduces a technique that combines multiple linear regression (MLR), random forest (RF), and XGBoost (XG) to diagnose diabetes from questionnaire data. In the proposed system, the MLR-RF algorithm is used for feature selection and XG is used for classification. The dataset is diabetic hospital data from Sylhet, Bangladesh. It contains 520 instances: 320 diabetic and 200 control. Classifier performance is measured in terms of accuracy (ACC), precision (PPV), recall (SEN, sensitivity), F1 score (F1), and the area under the receiver operating characteristic curve (AUC). The proposed system achieves an accuracy of 99.2%, an AUC of 99.3%, and a prediction time of 0.04825 seconds. The feature selection method improves the prediction time, although it does not affect the accuracy of the four compared classifiers. These results compare favorably with those reported in other studies. The proposed method can be used as an auxiliary tool in diagnosing diabetes.
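The evaluation metrics named in the abstract (ACC, PPV, SEN, F1, AUC) can be illustrated with a minimal sketch in plain Python. This is not the author's code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and AUC is computed here with the pairwise-ranking formulation, which is only one of several equivalent ways to obtain it.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return ACC, PPV, SEN and F1 for binary labels (1 = diabetic, 0 = control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    acc = (tp + tn) / len(pairs)
    ppv = tp / (tp + fp) if (tp + fp) else 0.0  # precision
    sen = tp / (tp + fn) if (tp + fn) else 0.0  # recall / sensitivity
    f1 = 2 * ppv * sen / (ppv + sen) if (ppv + sen) else 0.0
    return {"ACC": acc, "PPV": ppv, "SEN": sen, "F1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, NOT taken from the Sylhet dataset.
toy_labels = [1, 1, 1, 0, 0, 1, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.2, 0.55]
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed 0.5 threshold

print(classification_metrics(toy_labels, toy_preds))
print(roc_auc(toy_labels, toy_scores))
```

The threshold-based metrics depend on the chosen cut-off, whereas AUC depends only on how the scores rank positives against negatives, which is why the two are usually reported together.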
Pages: 34163-34181
Number of pages: 19
Related papers
50 in total
  • [1] Efficient prediction of early-stage diabetes using XGBoost classifier with random forest feature selection technique
    Serdar Gündoğdu
    Multimedia Tools and Applications, 2023, 82 : 34163 - 34181
  • [2] A Risk Prediction Model for Type 2 Diabetes Based on Weighted Feature Selection of Random Forest and XGBoost Ensemble Classifier
    Xu, Zhongxian
    Wang, Zhiliang
    2019 ELEVENTH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTATIONAL INTELLIGENCE (ICACI 2019), 2019, : 278 - 283
  • [3] Feature selection for outcome prediction in oesophageal cancer using genetic algorithm and random forest classifier
    Paul, Desbordes
    Su, Ruan
    Romain, Modzelewski
    Sebastien, Vauclin
    Pierre, Vera
    Isabelle, Gardin
    COMPUTERIZED MEDICAL IMAGING AND GRAPHICS, 2017, 60 : 42 - 49
  • [4] Feature Selection using Random Forest Classifier for Predicting Prostate Cancer
    Huljanah, Mia
    Rustam, Zuherman
    Utama, Suarsih
    Siswantining, Titin
    9TH ANNUAL BASIC SCIENCE INTERNATIONAL CONFERENCE 2019 (BASIC 2019), 2019, 546
  • [5] Classification models for likelihood prediction of diabetes at early stage using feature selection
    Oladimeji, Oladosu Oyebisi
    Oladimeji, Abimbola
    Oladimeji, Olayanju
    APPLIED COMPUTING AND INFORMATICS, 2024, 20 (3/4) : 279 - 286
  • [6] DDoS Detection Using Information Gain Feature Selection and Random Forest Classifier
    Mandala, Satria
    Ramadhan, Alvien Ihsan
    Rosalinda, Maya
    Zaki, Salim M.
    Weippl, Edgar
    2022 2ND INTERNATIONAL CONFERENCE ON INTELLIGENT CYBERNETICS TECHNOLOGY & APPLICATIONS (ICICYTA), 2022, : 294 - 299
  • [7] Multimodal sentiment analysis using reliefF feature selection and random forest classifier
    Angadi S.
    Reddy V.S.
    International Journal of Computers and Applications, 2021, 43 (09) : 931 - 939
  • [8] Construction of a diagnostic classifier for cervical intraepithelial neoplasia and cervical cancer based on XGBoost feature selection and random forest model
    Zhang, Jing
    Yang, Xiuqing
    Chen, Jia
    Han, Jing
    Chen, Xiaofeng
    Fan, Yueping
    Zheng, Hui
    JOURNAL OF OBSTETRICS AND GYNAECOLOGY RESEARCH, 2023, 49 (01) : 296 - 303
  • [9] Design of intelligent diabetes mellitus detection system using hybrid feature selection based XGBoost classifier
    Prabha, Anju
    Yadav, Jyoti
    Rani, Asha
    Singh, Vijander
    COMPUTERS IN BIOLOGY AND MEDICINE, 2021, 136
  • [10] Software Defect Prediction using Feature Selection and Random Forest Algorithm
    Ibrahim, Dyana Rashid
    Ghnemat, Rawan
    Hudaib, Amjad
    2017 INTERNATIONAL CONFERENCE ON NEW TRENDS IN COMPUTING SCIENCES (ICTCS), 2017, : 252 - 257