Machine learning for characterizing risk of type 2 diabetes mellitus in a rural Chinese population: the Henan Rural Cohort Study

Cited by: 0
Authors
Liying Zhang
Yikang Wang
Miaomiao Niu
Chongjian Wang
Zhenfei Wang
Affiliations
[1] Zhengzhou University, School of Information Engineering
[2] Zhengzhou University, Department of Epidemiology and Biostatistics, College of Public Health
Source
Keywords
DOI
Not available
CLC number
Subject classification code
Abstract
With the development of data mining, machine learning offers opportunities to improve discrimination by analyzing complex interactions among large numbers of variables. To test the ability of machine learning algorithms to predict the risk of type 2 diabetes mellitus (T2DM) in a rural Chinese population, we analyzed a total of 36,652 eligible participants from the Henan Rural Cohort Study. Risk assessment models for T2DM were developed using six machine learning algorithms: logistic regression (LR), classification and regression tree (CART), artificial neural networks (ANN), support vector machine (SVM), random forest (RF) and gradient boosting machine (GBM). Model performance was measured by the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, positive predictive value, negative predictive value and the area under the precision-recall curve. The importance of variables was identified based on each classifier and the Shapley additive explanations (SHAP) approach. Using all available variables, all models for predicting risk of T2DM demonstrated strong predictive performance, with AUCs ranging from 0.811 to 0.872 with laboratory data and from 0.767 to 0.817 without laboratory data. Among them, the GBM model performed best (AUC: 0.872 with laboratory data and 0.817 without laboratory data). Model performance plateaued once 30 variables had been introduced into each model, except for the CART model. Among the top-10 variables across all methods were sweet flavor, urine glucose, age, heart rate, creatinine, waist circumference, uric acid, pulse pressure, insulin, and hypertension. Important risk factors that previous risk prediction methods had not identified (urinary indicators, sweet flavor) were identified by machine learning in our study. These results show that machine learning methods are capable of predicting the risk of T2DM and yield greater insight into disease risk factors with no a priori assumption of causality.
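The threshold-based measures named in the abstract (sensitivity, specificity, positive and negative predictive value) and the AUC can be illustrated with a minimal Python sketch. The labels and scores below are made-up toy values, not data from the Henan Rural Cohort Study, and the function names and the 0.5 cut-off are assumptions chosen for illustration; the abstract does not say how the authors computed these measures. The area under the precision-recall curve is left out for brevity.

```python
# Illustrative sketch only: toy labels and scores, NOT data from the study.
# Function names and the 0.5 threshold are hypothetical choices.

def threshold_metrics(y_true, scores, threshold=0.5):
    """Sensitivity, specificity, PPV and NPV at a fixed score threshold."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    # No zero-division guards: the toy data has every cell populated.
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case outscores a
    random negative case, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: 1 = has T2DM, 0 = does not; scores are predicted risks.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

metrics = threshold_metrics(y_true, scores)
auc = roc_auc(y_true, scores)
print(metrics)
print(auc)
```

The pairwise definition of AUC used here is quadratic in the sample size, which is fine for a toy example; for a cohort of tens of thousands of participants a rank-based or library implementation would be the more practical choice.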
Related papers
50 in total
  • [1] Machine learning for characterizing risk of type 2 diabetes mellitus in a rural Chinese population: the Henan Rural Cohort Study
    Zhang, Liying
    Wang, Yikang
    Niu, Miaomiao
    Wang, Chongjian
    Wang, Zhenfei
    [J]. SCIENTIFIC REPORTS, 2020, 10 (01)
  • [3] Associations of midpoint of sleep and night sleep duration with type 2 diabetes mellitus in Chinese rural population: the Henan rural cohort study
    Zhai, Zhihan
    Liu, Xiaotian
    Zhang, Haiqing
    Dong, Xiaokang
    He, Yaling
    Niu, Miaomiao
    Pan, Mingming
    Wang, Chongjian
    Wang, Xiaoqiong
    Li, Yuqian
    [J]. BMC PUBLIC HEALTH, 2021, 21 (01)
  • [4] Association of plant-based diet and type 2 diabetes mellitus in Chinese rural adults: The Henan Rural Cohort Study
    Yang, Xiu
    Li, Yuqian
    Wang, Chongjian
    Mao, Zhenxing
    Chen, Yu
    Ren, Pengfei
    Fan, Mengying
    Cui, Songyang
    Niu, Kailin
    Gu, Ruohua
    Li, Linlin
    [J]. JOURNAL OF DIABETES INVESTIGATION, 2021, 12 (09) : 1569 - 1576
  • [5] Prevalence of impaired fasting glucose, type 2 diabetes and associated risk factors in undiagnosed Chinese rural population: the Henan Rural Cohort Study
    Abdulai, Tanko
    Li, Yuqian
    Zhang, Haiqing
    Tu, Runqi
    Liu, Xiaotian
    Zhang, Liying
    Dong, Xiaokang
    Li, Ruiying
    Wang, Yuming
    Wang, Chongjian
    [J]. BMJ OPEN, 2019, 9 (08)
  • [6] Dietary Potassium and Magnesium Intake with Risk of Type 2 Diabetes Mellitus Among Rural China: the Henan Rural Cohort Study
    Li, Jia
    Li, Yuqian
    Wang, Chongjian
    Mao, Zhenxing
    Yang, Tianyu
    Li, Yan
    Xing, Wenguo
    Li, Zhuoyang
    Zhao, Jiaoyan
    Li, Linlin
    [J]. BIOLOGICAL TRACE ELEMENT RESEARCH, 2024, 202 (09) : 3932 - 3944
  • [8] Adiposity reduces the risk of osteoporosis in Chinese rural population: the Henan rural cohort study
    Tian, Huiling
    Pan, Jun
    Qiao, Dou
    Dong, Xiaokang
    Li, Ruiying
    Wang, Yikang
    Tu, Runqi
    Abdulai, Tanko
    Liu, Xiaotian
    Hou, Jian
    Zhang, Gongyuan
    Wang, Chongjian
    [J]. BMC PUBLIC HEALTH, 2020, 20 (01)
  • [9] Mineralocorticoids, glucose homeostasis and type 2 diabetes mellitus: The Henan Rural Cohort study
    Wei, Dandan
    Liu, Xue
    Jiang, Jingjing
    Tu, Runqi
    Qiao, Dou
    Li, Ruiying
    Wang, Yikang
    Fan, Mengying
    Yang, Xiu
    Zhang, Jinyu
    Hou, Jian
    Huo, Wenqian
    Yu, Songcheng
    Li, Linlin
    Wang, Chongjian
    Mao, Zhenxing
    [J]. JOURNAL OF DIABETES AND ITS COMPLICATIONS, 2020, 34 (05)
  • [10] Age at menopause, body mass index, and risk of type 2 diabetes mellitus in postmenopausal Chinese women: The Henan Rural Cohort study
    Zhang, Lulu
    Bao, Lei
    Li, Yuqian
    Wang, Chongjian
    Dong, Xiaokang
    Abdulai, Tanko
    Yang, Xiu
    Fan, Mengying
    Cui, Songyang
    Zhou, Wen
    Mao, Zhenxing
    Huo, Wenqian
    Wei, Dandan
    Li, Linlin
    [J]. NUTRITION METABOLISM AND CARDIOVASCULAR DISEASES, 2020, 30 (08) : 1347 - 1354