Colon cancer diagnosis and staging classification based on machine learning and bioinformatics analysis

Cited by: 50
Authors
Su, Ying [1]
Tian, Xuecong [1]
Gao, Rui [3]
Guo, Wenjia [2]
Chen, Cheng [1]
Chen, Chen [3, 4]
Jia, Dongfang [1]
Li, Hongtao [2]
Lv, Xiaoyi [1, 5]
Affiliations
[1] Xinjiang Univ, Coll Software, Urumqi 830046, Xinjiang, Peoples R China
[2] Xinjiang Med Univ, Affiliated Tumor Hosp, Urumqi 830011, Peoples R China
[3] Xinjiang Med Univ, Coll Informat Sci & Engn, Urumqi 830046, Peoples R China
[4] Cloud Comp Engn Technol Res Ctr Xinjiang, Kelamayi 834099, Peoples R China
[5] Xinjiang Univ, Key Lab Signal Detect & Proc, Urumqi 830046, Xinjiang, Peoples R China
Keywords
Machine learning; Colon cancer; Prognosis; WGCNA; Staging; PPI; GENE-EXPRESSION;
DOI
10.1016/j.compbiomed.2022.105409
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Advanced metastasis makes colon cancer more difficult to treat. Identifying markers of colon cancer allows the stage of the cancer to be diagnosed in time and the prognosis to be improved through timely treatment. This paper uses gene expression profiling data from The Cancer Genome Atlas (TCGA) to diagnose colon cancer and its stage. We first selected the gene modules most strongly correlated with cancer using Weighted Gene Co-expression Network Analysis (WGCNA), extracted characteristic genes from the differential expression results using the least absolute shrinkage and selection operator (Lasso) algorithm, and performed survival analysis. The genes in the modules were then combined with the Lasso-extracted feature genes to distinguish colon cancer from healthy controls using random forest (RF), support vector machine (SVM) and decision tree classifiers, and colon cancer staging was diagnosed using the differentially expressed genes for each stage. Protein-Protein Interaction (PPI) networks were then constructed for 289 genes to identify clusters of aggregated proteins for survival analysis. The RF model gave the best results. In k-fold cross-validation for colon cancer versus the control group, it reached an average accuracy of 99.81%, an F1 value of 0.9968, an accuracy of 99.88%, and a recall of 99.5%. In the diagnosis of colon cancer stages I, II, III and IV, it reached an average accuracy of 91.5%, an F1 value of 0.7679, an accuracy of 86.94%, and a recall of 73.04%. Eight genes associated with colon cancer prognosis were identified: GCNT2, GLDN, SULT1B1, UGT2B15, PTGDR2, GPR15, BMP5 and CPT2.
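The abstract reports accuracy, F1 and recall for both a two-class task (cancer versus control) and a four-class task (stages I to IV). As a rough illustration of how such figures are commonly computed for a multi-class problem, the sketch below derives accuracy and macro-averaged precision, recall and F1 from true and predicted labels. Macro averaging is an assumption here, since the abstract does not say which averaging scheme the authors used, and the function names and toy labels are hypothetical.

```python
from typing import Dict, Sequence


def safe_div(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def macro_metrics(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Accuracy plus macro-averaged precision, recall and F1.

    Macro averaging (an assumption, not taken from the paper) computes each
    metric per class and then takes the unweighted mean over classes.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []

    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)

        precision = safe_div(tp, tp + fp)
        recall = safe_div(tp, tp + fn)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(safe_div(2 * precision * recall, precision + recall))

    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    n_labels = len(labels)
    return {
        "accuracy": correct / len(y_true),
        "precision": sum(precisions) / n_labels,
        "recall": sum(recalls) / n_labels,
        "f1": sum(f1s) / n_labels,
    }


if __name__ == "__main__":
    # Made-up stage labels for six hypothetical samples, not data from the paper.
    true_stages = ["I", "I", "II", "II", "III", "IV"]
    predicted_stages = ["I", "II", "II", "II", "III", "III"]
    for name, value in macro_metrics(true_stages, predicted_stages).items():
        print(f"{name}: {value:.4f}")
```

Because macro averaging weights every class equally, a rarely predicted stage can pull recall and F1 well below accuracy, which is one plausible reading of the gap between the staging accuracy and recall figures reported above.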
Pages: 10
Related papers
50 items in total
  • [41] Classification Prediction of Lung Cancer Based on Machine Learning Method
    Li, Dantong
    Li, Guixin
    Li, Shuang
    Bang, Ashley
    [J]. INTERNATIONAL JOURNAL OF HEALTHCARE INFORMATION SYSTEMS AND INFORMATICS, 2024, 19 (01)
  • [42] A Hybrid Machine Learning Approach for the Phenotypic Classification of Metagenomic Colon Cancer Reads Based on Kmer Frequency and Biomarker Profiling
    Kishk, Ali
    Elzizy, Asmaa
    Galal, Dina
    Razek, Elham Abdel
    Fawzy, Esraa
    Ahmed, Gehad
    Gawish, Mohamed
    Hamad, Safwat
    El-Hadidi, Mohamed
    [J]. 2018 9TH CAIRO INTERNATIONAL BIOMEDICAL ENGINEERING CONFERENCE (CIBEC), 2018, : 118 - 121
  • [44] Machine Learning-Based Classification of the Health State of Mice Colon in Cancer Study from Confocal Laser Endomicroscopy
    Rasti, Pejman
    Wolf, Christian
    Dorez, Hugo
    Sablong, Raphael
    Moussata, Driffa
    Samiei, Salma
    Rousseau, David
    [J]. SCIENTIFIC REPORTS, 2019, 9 (1)
  • [45] Machine learning for predicting colon cancer recurrence
    Kayikcioglu, Erkan
    Onder, Arif Hakan
    Bacak, Burcu
    Serel, Tekin Ahmet
    [J]. SURGICAL ONCOLOGY-OXFORD, 2024, 54
  • [46] Machine Learning-Based Analysis of MR Multiparametric Radiomics for the Subtype Classification of Breast Cancer
    Xie, Tianwen
    Wang, Zhe
    Zhao, Qiufeng
    Bai, Qianming
    Zhou, Xiaoyan
    Gu, Yajia
    Peng, Weijun
    Wang, He
    [J]. FRONTIERS IN ONCOLOGY, 2019, 9
  • [47] Aided diagnosis methods of breast cancer based on machine learning
    Zhao, Yue
    Wang, Nian
    Cui, Xiaoyu
    [J]. 2ND ANNUAL INTERNATIONAL CONFERENCE ON INFORMATION SYSTEM AND ARTIFICIAL INTELLIGENCE (ISAI2017), 2017, 887
  • [48] Automated Breast Cancer Diagnosis Based on Machine Learning Algorithms
    Dhahri, Habib
    Al Maghayreh, Eslam
    Mahmood, Awais
    Elkilani, Wail
    Nagi, Mohammed Faisal
    [J]. JOURNAL OF HEALTHCARE ENGINEERING, 2019, 2019
  • [49] A novel tool for the accurate and affordable early diagnosis of pancreatic cancer via machine learning and bioinformatics.
    Goel, Siya
    Honorio, Jean
    [J]. CANCER RESEARCH, 2021, 81 (13)
  • [50] Machine Learning Based Performance Development for Diagnosis of Breast Cancer
    Bektas, Burcu
    Babur, Sebahattin
    [J]. 2016 MEDICAL TECHNOLOGIES NATIONAL CONFERENCE (TIPTEKNO), 2015,