Cancer Metastasis Prediction and Genomic Biomarker Identification through Machine Learning and eXplainable Artificial Intelligence in Breast Cancer Research

Cited by: 15
Authors
Yagin, Burak [1]
Yagin, Fatma Hilal [1]
Colak, Cemil [1]
Inceoglu, Feyza [2]
Kadry, Seifedine [3, 4, 5]
Kim, Jungeun [6]
Affiliations
[1] Inonu Univ, Fac Med, Dept Biostat & Med Informat, TR-44280 Malatya, Turkiye
[2] Malatya Turgut Ozal Univ, Fac Med, Dept Biostat, TR-44090 Malatya, Turkiye
[3] Noroff Univ Coll, Dept Appl Data Sci, N-4612 Kristiansand, Norway
[4] Ajman Univ, Artificial Intelligence Res Ctr AIRC, Ajman 346, U Arab Emirates
[5] Lebanese Amer Univ, Dept Elect & Comp Engn, Byblos 36, Lebanon
[6] Kongju Natl Univ, Dept Software, Cheonan 31080, South Korea
Keywords
breast cancer metastasis; machine learning algorithms; genomic biomarkers; eXplainable artificial intelligence; SHAP; EXPRESSION; ASSOCIATION; PROGNOSIS
DOI
10.3390/diagnostics13213314
Chinese Library Classification number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Aim: This research presents a model combining machine learning (ML) techniques and eXplainable artificial intelligence (XAI) to predict breast cancer (BC) metastasis and reveal important genomic biomarkers in patients with metastasis.
Method: A total of 98 primary BC samples were analyzed, including 34 samples from patients who developed distant metastases within a 5-year follow-up period and 44 samples from patients who remained disease-free for at least 5 years after diagnosis. Genomic data were subjected to biostatistical analysis, followed by elastic net feature selection, which identified a restricted number of genomic biomarkers associated with BC metastasis. Light gradient boosting machine (LightGBM), categorical boosting (CatBoost), extreme gradient boosting (XGBoost), gradient boosting trees (GBT), and adaptive boosting (AdaBoost) algorithms were used for prediction. To assess the models' predictive abilities, accuracy, F1 score, precision, recall, area under the ROC curve (AUC), and Brier score were calculated as performance evaluation metrics. To promote interpretability and overcome the "black box" problem of ML models, the SHapley Additive exPlanations (SHAP) method was employed.
Results: The LightGBM model outperformed the other models, achieving an accuracy of 96% and an AUC of 99.3%. In addition to the biostatistical evaluation, the XAI-based SHAP results showed that increased expression levels of TSPYL5, ATP5E, CA9, NUP210, SLC37A1, ARIH1, PSMD7, UBQLN1, PRAME, and UBE2T (p <= 0.05) were associated with an increased incidence of BC metastasis. Decreased expression levels of the CACTIN, TGFB3, SCUBE2, ARL4D, OR1F1, ALDH4A1, PHF1, and CROCC genes (p <= 0.05) were also found to increase the risk of metastasis in BC.
Conclusion: The findings of this study may help prevent disease progression and metastasis and may improve clinical outcomes by informing customized treatment approaches for BC patients.
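The evaluation metrics named in the abstract (accuracy, precision, recall, F1 score, AUC, and Brier score) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' code: the function names `roc_auc` and `classification_metrics`, the 0.5 decision threshold, and the toy labels and probabilities are all assumptions made for illustration, with 1 standing for "metastasis" and 0 for "disease-free".

```python
from typing import Dict, Sequence


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for pp in pos:
        for pn in neg:
            if pp > pn:
                wins += 1.0
            elif pp == pn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def classification_metrics(
    y_true: Sequence[int],
    y_prob: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off; not stated in the abstract
) -> Dict[str, float]:
    """Compute the six metrics from true labels and predicted probabilities."""
    if not y_true or len(y_true) != len(y_prob):
        raise ValueError("inputs must be non-empty and of equal length")
    y_pred = [1 if p >= threshold else 0 for p in y_prob]

    # Confusion-matrix counts for the positive class (label 1).
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    # Brier score: mean squared error between probability and outcome.
    brier = sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "auc": roc_auc(y_true, y_prob),
        "brier": brier,
    }


# Toy data (hypothetical, not from the study).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_probs = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
for name, value in classification_metrics(toy_labels, toy_probs).items():
    print(f"{name}: {value:.3f}")
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas AUC and the Brier score are computed directly from the predicted probabilities; lower Brier scores indicate better-calibrated predictions.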
Pages: 12