A cost-effective machine learning-based method for preeclampsia risk assessment and driver genes discovery

Cited by: 15
Authors
Wang, Hao [1,2]
Zhang, Zhaoyue [3]
Li, Haicheng [1,2]
Li, Jinzhao [1]
Li, Hanshuang [1]
Liu, Mingzhu [1,2]
Liang, Pengfei [1]
Xi, Qilemuge [1]
Xing, Yongqiang [4]
Yang, Lei [5]
Zuo, Yongchun [1,2]
Affiliations
[1] Inner Mongolia Univ, Coll Life Sci, State Key Lab Reprod Regulat & Breeding Grassland, Hohhot 010070, Peoples R China
[2] Inner Mongolia Wesure Date Technol Co Ltd, Inner Mongolia Intelligent Union Big Data Acad, Digital Coll, Hohhot 010010, Peoples R China
[3] Univ Elect Sci & Technol China, Ctr Informat Biol, Sch Life Sci & Technol, Chengdu 610054, Peoples R China
[4] Inner Mongolia Univ Sci & Technol, Sch Life Sci & Technol, Baotou 014010, Peoples R China
[5] Harbin Med Univ, Coll Bioinformat Sci & Technol, Harbin 150081, Peoples R China
Source
CELL AND BIOSCIENCE | 2023, Vol. 13, Issue 01
Keywords
Preeclampsia risk; Machine learning; Feature selection; Marker genes; Web server; SINGLE-CELL; CANCER CLASSIFICATION; DIFFERENTIATION; EXPRESSION; IDENTIFICATION; PREDICTION;
DOI
10.1186/s13578-023-00991-y
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: The placenta, as a unique exchange organ between mother and fetus, is essential for successful human pregnancy and fetal health. Preeclampsia (PE) caused by placental dysfunction contributes to both maternal and infant morbidity and mortality. Accurate identification of PE patients plays a vital role in the formulation of treatment plans. However, traditional clinical methods for diagnosing PE have a high misdiagnosis rate.

Results: Here, we first designed a computational biology method that used single-cell transcriptome (scRNA-seq) data from healthy pregnancy (38 wk) and early-onset PE (28-32 wk) to identify pathological cell subpopulations and predict PE risk. Based on machine learning methods and feature selection techniques, we observed that the Tuning ReliefF (TURF) score combined with XGBoost (TURF_XGB) achieved optimal performance, with 92.61% accuracy and 92.46% recall for classifying nine cell subpopulations of healthy placentas. Biological landscapes of placenta heterogeneity could be mapped by the 110 marker genes screened by TURF_XGB, which revealed the superiority of the TURF feature mining. Moreover, we processed the PE dataset with LASSO to obtain 497 biomarkers. Integration analysis of the above two gene sets revealed that dendritic cells were closely associated with early-onset PE, and that C1QB and C1QC might drive preeclampsia by mediating inflammation. In addition, an ensemble model-based risk stratification card was developed to classify preeclampsia patients, and its area under the receiver operating characteristic curve (AUC) could reach 0.99. For broader accessibility, we also built an online web server.

Conclusion: Single-cell transcriptome-based preeclampsia risk assessment using an ensemble machine learning framework is a valuable asset for clinical decision-making. C1QB and C1QC may be involved in the development and progression of early-onset PE by affecting the complement and coagulation cascades pathway that mediates inflammation, which has important implications for better understanding the pathogenesis of PE.
Pages: 12