Explainable Machine Learning Model to Prediction EGFR Mutation in Lung Cancer

Cited by: 5
Authors
Yang, Ruiyuan [1 ]
Xiong, Xingyu [1 ]
Wang, Haoyu [1 ]
Li, Weimin [1 ,2 ,3 ,4 ]
Affiliations
[1] Sichuan Univ, West China Hosp, Dept Resp & Crit Care Med, Chengdu, Peoples R China
[2] Sichuan Univ, West China Hosp, Inst Resp Hlth Frontiers Sci Ctr Dis Related Mol N, Chengdu, Peoples R China
[3] Sichuan Univ, West China Hosp, Precis Med Ctr, Precis Med Key Lab Sichuan Prov, Chengdu, Peoples R China
[4] West China Hosp, Chinese Acad Med Sci, Res Units West China, Chengdu, Peoples R China
Source
FRONTIERS IN ONCOLOGY | 2022 / Vol. 12
Funding
National Natural Science Foundation of China;
Keywords
EGFR mutation; lung cancer; prediction; machine learning; SHAP value; MOLECULAR EPIDEMIOLOGY; SYSTEMIC THERAPY; ADENOCARCINOMA; RISK;
DOI
10.3389/fonc.2022.924144
Chinese Library Classification
R73 [Oncology];
Discipline classification code
100214;
Abstract
Objectives: The aim of this study was to determine whether clinical features, including blood markers, can be used to build an explainable machine learning model that predicts epidermal growth factor receptor (EGFR) mutation in lung cancer.
Methods: We retrospectively analyzed 7,413 patients with lung adenocarcinoma (LA) diagnosed by gene sequencing at West China Hospital of Sichuan University from April 2015 to June 2019. The machine learning algorithms (MLAs) included logistic regression (LR), random forest (RF), LightGBM, support vector machine (SVM), multi-layer perceptron (MLP), extreme gradient boosting (XGBoost), and decision tree (DT). Demographic characteristics, personal history, and blood markers were taken into account. The area under the receiver operating characteristic curve (AUC) was used to evaluate the prediction models, and SHapley Additive exPlanation (SHAP) values were used to explain them.
Results: Of the 7,413 patients with LA, 3,527 (47.6%) were identified with EGFR mutation. RF achieved the best performance in predicting EGFR mutation (AUC 0.771, 95% confidence interval (CI): 0.770, 0.772), which was similar to that of XGBoost (AUC 0.740, 95% CI: 0.739, 0.741). The five most influential features were smoking consumption, sex, cholesterol, age, and albumin-globulin ratio. SHAP summary and dependence plots were used to explain the effect of the 12 features on the model and how a single feature influences the output, respectively.
Conclusion: We established EGFR mutation prediction models using MLAs and found that RF was preferred (AUC 0.771, 95% CI: 0.770, 0.772), performing better than the traditional models. Therefore, the artificial intelligence-based MLA prediction model may become a practical tool to guide the diagnosis and therapy of LA.
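The abstract reports each AUC with a 95% confidence interval but does not say how the intervals were obtained. As a minimal sketch only, the following assumes a percentile bootstrap, which is one common choice and not necessarily the authors' method. The function names `auc_score` and `bootstrap_auc_ci` and the synthetic labels and scores are hypothetical and are not taken from the study.

```python
import random


def auc_score(labels, scores):
    """AUC via the rank-sum (Mann-Whitney U) formula; tied scores share an average rank."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_auc_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (assumed method, see lead-in)."""
    auc_score(labels, scores)  # fails early if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auc_score(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic example: positives tend to receive somewhat higher scores.
_rng = random.Random(42)
demo_labels = [_rng.randint(0, 1) for _ in range(200)]
demo_scores = [0.3 * y + _rng.random() for y in demo_labels]
demo_auc = auc_score(demo_labels, demo_scores)
demo_ci = bootstrap_auc_ci(demo_labels, demo_scores)
print(f"AUC = {demo_auc:.3f}, 95% CI = ({demo_ci[0]:.3f}, {demo_ci[1]:.3f})")
```

In practice the labels would be the sequencing-confirmed EGFR status and the scores would be a model's predicted probabilities on held-out patients. The width of the interval depends on the sample size and on the resampling scheme chosen.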
Pages: 8
Related papers
50 in total
  • [1] Machine Learning and Feature Selection Methods for EGFR Mutation Status Prediction in Lung Cancer
    Morgado, Joana
    Pereira, Tania
    Silva, Francisco
    Freitas, Claudia
    Negrao, Eduardo
    de Lima, Beatriz Flor
    da Silva, Miguel Correia
    Madureira, Antonio J.
    Ramos, Isabel
    Hespanhol, Venceslau
    Costa, Jose Luis
    Cunha, Antonio
    Oliveira, Helder P.
    [J]. APPLIED SCIENCES-BASEL, 2021, 11 (07):
  • [2] An explainable machine learning framework for lung cancer hospital length of stay prediction
    Alsinglawi, Belal
    Alshari, Osama
    Alorjani, Mohammed
    Mubin, Omar
    Alnajjar, Fady
    Novoa, Mauricio
    Darwish, Omar
    [J]. SCIENTIFIC REPORTS, 2022, 12 (01)
  • [3] An explainable machine learning framework for lung cancer hospital length of stay prediction
    Belal Alsinglawi
    Osama Alshari
    Mohammed Alorjani
    Omar Mubin
    Fady Alnajjar
    Mauricio Novoa
    Omar Darwish
    [J]. Scientific Reports, 12
  • [4] Explainable Machine Learning for Lung Cancer Screening Models
    Kobylinska, Katarzyna
    Orlowski, Tadeusz
    Adamek, Mariusz
    Biecek, Przemyslaw
    [J]. APPLIED SCIENCES-BASEL, 2022, 12 (04):
  • [5] Explainable machine learning model identified potential biomarkers in liver cancer survival prediction
    Pan, Qi
    Hounye, Alphonse Houssou
    Miao, Kexin
    Su, Liuyan
    Wang, Jiaoju
    Hou, Muzhou
    Xiong, Li
    [J]. BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2024, 96
  • [6] Explainable Machine Learning Model for Chronic Kidney Disease Prediction
    Arif, Muhammad Shoaib
    Rehman, Ateeq Ur
    Asif, Daniyal
    [J]. Algorithms, 2024, 17 (10)
  • [7] Ensemble Strategies for EGFR Mutation Status Prediction in Lung Cancer
    Malafaia, Mafalda
    Pereira, Tania
    Silva, Francisco
    Morgado, Joana
    Cunha, Antonio
    Oliveira, Helder P.
    [J]. 2021 43RD ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE & BIOLOGY SOCIETY (EMBC), 2021, : 3285 - 3288
  • [8] Model for predicting EGFR mutation status in lung cancer
    Lam Nguyen Ho
    Thuong Vu Le
    [J]. BREATHE, 2019, 15 (04) : 340 - 342
  • [9] EGFR Mutation Prediction of Lung Biopsy Images using Deep Learning
    Gupta, Ravi Kant
    Nandgaonkar, Shivani
    Kurian, Nikhil Cherian
    Bameta, Tripti
    Yadav, Subhash
    Kaushal, Rajiv Kumar
    Rane, Swapnil
    Sethi, Amit
    [J]. arXiv, 2022,
  • [10] Explainable Machine Learning-Based Prediction Model for Diabetic Nephropathy
    Yin, Jing-Mei
    Li, Yang
    Xue, Jun-Tang
    Zong, Guo-Wei
    Fang, Zhong-Ze
    Zou, Lang
    [J]. JOURNAL OF DIABETES RESEARCH, 2024, 2024