A Risk Prediction Model for Type 2 Diabetes Based on Weighted Feature Selection of Random Forest and XGBoost Ensemble Classifier

Cited by: 6
Authors
Xu, Zhongxian [1]
Wang, Zhiliang [1]
Affiliations
[1] Univ Sci & Technol Beijing, Sch Comp & Commun Engn, Beijing, Peoples R China
Keywords
diagnosis of diabetes; data mining; weighted feature selection; random forest; extreme gradient boosting
DOI
10.1109/icaci.2019.8778622
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Type 2 diabetes mellitus is a severe chronic disease that threatens human health and has a high incidence worldwide. Effective prediction models are needed to diagnose and prevent diabetes in time. Data mining, with its classification capability, has become increasingly important in medical diagnosis. This paper proposes a risk prediction model for type 2 diabetes based on ensemble learning. In the proposed model, a weighted feature selection algorithm based on random forest (RF-WFS) is used for optimal feature selection, and an extreme gradient boosting (XGBoost) classifier is used for classification. The effectiveness of the method was validated by comparing various performance metrics and the results of different contrast experiments. The method also achieves better prediction accuracy than other classification algorithms (C4.5, Naive Bayes, AdaBoost, Random Forest). Validation results on the UCI Pima Indian diabetes dataset show that the model has better accuracy and classification performance than other results reported in the literature. These results indicate that the model would be effective for diagnosing diabetes at the initial stage.
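The abstract does not say how RF-WFS computes its feature weights, so the following is only a minimal sketch of the general two-stage idea: rank features by random forest importance, keep the top-ranked ones, then train a boosted classifier on the reduced set. Several things here are assumptions rather than the authors' method: the helper name `select_by_rf_importance`, the cut-off `k=5`, the use of scikit-learn's `GradientBoostingClassifier` as a stand-in for XGBoost (to avoid an extra dependency), and the synthetic data, which only mimics the size of a small tabular dataset and is not the Pima data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split


def select_by_rf_importance(X, y, k, seed=0):
    """Hypothetical helper: indices of the k features with the highest
    random forest importance, plus the full weight vector."""
    forest = RandomForestClassifier(n_estimators=200, random_state=seed)
    forest.fit(X, y)
    weights = forest.feature_importances_  # one non-negative weight per feature
    top_k = np.argsort(weights)[::-1][:k]  # highest-weighted features first
    return np.sort(top_k), weights


# Synthetic stand-in data (assumption): 768 samples, 8 features, binary label.
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=4, n_redundant=2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Stage 1: feature selection, fitted on the training split only.
selected, weights = select_by_rf_importance(X_train, y_train, k=5)

# Stage 2: gradient boosting on the selected features
# (stand-in for an XGBoost classifier).
clf = GradientBoostingClassifier(random_state=0)
clf.fit(X_train[:, selected], y_train)
accuracy = clf.score(X_test[:, selected], y_test)

print("selected feature indices:", selected)
print("held-out accuracy:", round(accuracy, 3))
```

Fitting the selector on the training split alone keeps the held-out accuracy from being inflated by information leaking out of the test rows.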
Pages: 278-283
Number of pages: 6