Explainable machine learning model identified potential biomarkers in liver cancer survival prediction

Authors
Pan, Qi [1]
Hounye, Alphonse Houssou [1]
Miao, Kexin [1]
Su, Liuyan [1]
Wang, Jiaoju [1]
Hou, Muzhou [1]
Xiong, Li [2, 3]
Affiliations
[1] Cent South Univ, Sch Math & Stat, Changsha 410083, Peoples R China
[2] Cent South Univ, Xiangya Hosp 2, Dept Gen Surg, Changsha 410011, Peoples R China
[3] Hunan Clin Res Ctr Intelligent Gen Surg, Changsha 410011, Peoples R China
Keywords
Random Forest; XGBoost; Support Vector Machine (SVM); SHAP; Immunogenic Cell Death (ICD); Prognostic model; CEP55;
DOI
10.1016/j.bspc.2024.106504
Chinese Library Classification (CLC) number
R318 [Biomedical Engineering];
Discipline classification code
0831;
Abstract
Objective: Liver cancer is a malignant tumor with a high incidence; common treatments include surgical resection, ablation, arterial catheterization, and liver transplantation. Improving the clinical evaluation and therapy management of liver hepatocellular carcinoma (LIHC) is crucial, and when machine learning methods are incorporated into decision-making, the comprehensibility of the models must be considered. In this study, the SHapley Additive exPlanation (SHAP) technique was applied to interpret a gradient-boosting decision tree (XGBoost) survival model built on The Cancer Genome Atlas (TCGA) data, with the aim of explaining an otherwise black-box model and identifying potential biomarkers for liver cancer survival prediction.
Methods: Expression data and clinical information for liver cancer samples were obtained from the TCGA database, and Immunogenic Cell Death (ICD)-related genes were retrieved from the literature. Genes were screened using bioinformatics and machine learning methods. The screened differentially expressed genes (DEGs) and ICD-related genes were jointly used to construct the SurvMLSHAP model, and the SurvMLSHAP score was calculated. Three methods, Bayesian optimization, random search, and a genetic algorithm, were used for parameter optimization. Eight machine learning models were built to assess the proposed model's performance and to select the best model.
Results: The SurvMLSHAP model output was interpreted using the XGBoost-based SHAP method to assess the influence and significance of each feature. Tests on both synthetic and medical data validate the capability of SurvMLSHAP to identify factors that have a time-dependent impact. The C-index was 0.6844 on the raw data and 0.8167 on the validation data. Furthermore, aggregating SurvMLSHAP values yields a more accurate assessment of variable relevance for prediction than other existing approaches. The features contributing most to the XGBoost model were, in order, CEP55, PPIA, TTC36, and HSP90AA1, which could be used as predictors to assess the LIHC cohort, while the putative molecular subgroups could provide new ideas for individualized treatment of LIHC.
Conclusion: In this study, a risk prognostic model called SurvMLSHAP was constructed using bioinformatics and machine learning methods, and ICD-related biomarkers were screened to assess the prognostic outcome of LIHC patients, which can support personalized treatment for clinical patients.
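The C-index values reported in the abstract measure how well predicted risk ranks patients by survival time. As a rough illustration only, the sketch below computes Harrell's concordance index in plain Python. It is not the authors' implementation: the function name `concordance_index`, the argument names, and the toy data are assumptions made for this example, and pairs with tied survival times are simply skipped.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data (illustrative sketch).

    times       -- observed follow-up times
    events      -- 1 if the event was observed, 0 if the sample was censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the sample with the shorter time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only when the times differ and the shorter
        # time ends in an observed event (assumption: tied times are skipped).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # higher risk had the earlier event
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, the third one censored.
toy_times = [5, 8, 12, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.5, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.6844 and 0.8167 should be read.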
Pages: 17