Copy number variation in plasma as a tool for lung cancer prediction using Extreme Gradient Boosting (XGBoost) classifier

Cited by: 57
Authors
Yu, Daping [1, 2]
Liu, Zhidong [1, 2]
Su, Chongyu [1, 2]
Han, Yi [1, 2]
Duan, XinChun [1, 2]
Zhang, Rui [1, 2]
Liu, Xiaoshuang [3]
Yang, Yang [4]
Xu, Shaofa [1, 2]
Affiliations
[1] Capital Med Univ, Beijing Chest Hosp, Thorac Surg Dept, Area 1st,9 Compound,Beiguan St, Beijing, Peoples R China
[2] Beijing TB & Thorac Tumor Res Inst, Area 1st,9 Compound,Beiguan St, Beijing, Peoples R China
[3] Ping An Hlth Technol, Beijing, Peoples R China
[4] Beijing Gencode Diagnost Lab, Beijing, Peoples R China
Keywords
cfDNA; CNV; early diagnosis; lung cancer; XGBoost; CIRCULATING-TUMOR DNA; HEPATOCELLULAR-CARCINOMA; RISK-FACTORS
DOI
10.1111/1759-7714.13204
Chinese Library Classification number
R73 [Oncology]
Discipline classification code
100214
Abstract
Background: Lung cancer (LC) is the main cause of cancer death and usually presents at an advanced stage, but its early detection would increase the benefits of treatment. Blood is particularly favored in clinical research because it allows relatively noninvasive analyses. Copy number variation (CNV) is a common genetic change in tumor genomes, and many studies have indicated that CNV derived from cell-free DNA (cfDNA) in plasma could be feasible as a biomarker for cancer diagnosis.
Methods: In this study, we determined the possibility of using chromosomal arm-level CNV from cfDNA as a biomarker for lung cancer diagnosis in a small cohort of 40 patients and 41 healthy controls. Arm-level CNV distributions were analyzed based on z score, and the machine-learning algorithm Extreme Gradient Boosting (XGBoost) was applied for cancer prediction.
Results: Amplifications tended to emerge on chromosome arms 3q, 8q, 12p, and 7q. Deletions were frequently detected on chromosome arms 22q, 3p, 5q, 16q, 10q, and 15q. Upon applying a trained XGBoost classifier, specificity and sensitivity of 100% were achieved in the test group (12 patients and 13 healthy controls). In addition, five-fold cross-validation supported the stability of the model. Finally, our results suggested that the integration of four arm-level CNVs and the concentration of cfDNA into the trained XGBoost classifier provides a potential method for detecting lung cancer.
Conclusion: Our results suggest that four arm-level CNVs and the concentration of cfDNA, integrated into the trained XGBoost classifier, could become a potential method for lung cancer detection.
Key points
Significant findings of the study: Healthy individuals have different arm-level CNV profiles from cancer patients. Amplifications tend to emerge on chromosome arms 3q, 8q, 12p, and 7q, and deletions tend to emerge on chromosome arms 22q, 3p, 5q, 16q, 10q, and 15q.
What this study adds: cfDNA concentration and arms 10q, 3q, 8q, 3p, and 22q are key features for prediction. The trained XGBoost classifier is a potential method for lung cancer detection.
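The abstract mentions z-score analysis of arm-level CNV and reports sensitivity and specificity, but gives no formulas. The following is a minimal sketch of how those two quantities are commonly computed, not the authors' pipeline. It assumes each sample is summarized as a normalized read fraction per chromosome arm and that z scores are taken against the healthy-control distribution. The function names, input layout, and all numbers are hypothetical, and the XGBoost training step itself is not shown.

```python
from statistics import mean, stdev


def arm_z_scores(sample, controls):
    """Z-score each chromosome arm of `sample` against healthy controls.

    sample:   dict mapping arm name -> normalized read fraction
              (hypothetical input layout, not taken from the paper)
    controls: list of dicts with the same keys, one per healthy control
    """
    z = {}
    for arm, value in sample.items():
        ref = [c[arm] for c in controls]
        mu, sigma = mean(ref), stdev(ref)  # sample standard deviation
        z[arm] = (value - mu) / sigma
    return z


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = cancer."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Toy, made-up values for illustration only.
controls = [
    {"3q": 0.0300, "22q": 0.0120},
    {"3q": 0.0310, "22q": 0.0118},
    {"3q": 0.0290, "22q": 0.0122},
]
sample = {"3q": 0.0330, "22q": 0.0110}
z = arm_z_scores(sample, controls)

# Toy labels and predictions, unrelated to the study's test group.
y_true = [1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

In this toy input a positive z score on 3q would read as a gain and a negative one on 22q as a loss, matching the direction of the arm-level changes the abstract describes. Per-arm z scores like these, together with cfDNA concentration, are the kind of features a gradient-boosted classifier could be trained on.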
Pages: 95 - 102
Number of pages: 8