Identification of Breast Cancer Metastasis Markers from Gene Expression Profiles Using Machine Learning Approaches

Cited by: 4
Authors
Jung, Jinmyung [1 ]
Yoo, Sunyong [2 ]
Affiliations
[1] Univ Suwon, Coll Informat & Commun Technol, Div Data Sci, Hwaseong 18323, South Korea
[2] Chonnam Natl Univ, Dept ICT Convergence Syst Engn, Gwangju 61005, South Korea
Funding
National Research Foundation of Singapore;
Keywords
metastasis marker; gene expression; machine learning; XGBoost; breast cancer; feature importance; PROTEIN; REGULATOR; RESOURCE;
DOI
10.3390/genes14091820
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Cancer metastasis accounts for approximately 90% of cancer deaths, and elucidating markers in metastasis is the first step in its prevention. To characterize metastasis marker genes (MGs) of breast cancer, XGBoost models that classify metastasis status were trained with gene expression profiles from TCGA. Then, a metastasis score (MS) was assigned to each gene by calculating the inner product between the feature importance and the AUC performance of the models. As a result, 54, 202, and 357 genes with the highest MS were characterized as MGs by empirical p-value cutoffs of 0.001, 0.005, and 0.01, respectively. The three sets of MGs were compared with those from existing metastasis marker databases, which provided significant results in most comparisons (p-value < 0.05). They were also significantly enriched in biological processes associated with breast cancer metastasis. The three MGs, SPPL2C, KRT23, and RGS7, showed highly significant results (p-value < 0.01) in the survival analysis. The MGs that could not be identified by statistical analysis (e.g., GOLM1, ELAVL1, UBP1, and AZGP1), as well as the MGs with the highest MS (e.g., ZNF676, FAM163B, LDOC2, IRF1, and STK40), were verified via the literature. Additionally, we checked how close the MGs were to each other in the protein-protein interaction networks. We expect that the characterized markers will help understand and prevent breast cancer metastasis.
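The metastasis score described in the abstract can be illustrated with a minimal sketch. It assumes each trained model contributes one row of per-gene feature importances and one AUC value, so the score of a gene is the inner product of its importance column with the AUC vector. The function names, the array layout, and the null distribution used for the empirical p-value are hypothetical choices made for illustration; the abstract does not specify them.

```python
import numpy as np


def metastasis_scores(importances, aucs):
    """Return one score per gene.

    importances: array of shape (n_models, n_genes), feature importance
                 of each gene in each model (assumed layout).
    aucs:        array of shape (n_models,), AUC of each model.
    """
    importances = np.asarray(importances, dtype=float)
    aucs = np.asarray(aucs, dtype=float)
    if importances.ndim != 2 or importances.shape[0] != aucs.shape[0]:
        raise ValueError("need one importance row and one AUC per model")
    # Inner product over the model axis: sum_m AUC_m * importance_{m, g}
    return aucs @ importances


def empirical_p_values(scores, null_scores):
    """Share of null scores at least as large as each observed score.

    The null distribution is an assumption here (for example, scores
    obtained from models trained on permuted labels). The +1 terms
    keep the p-value strictly above zero.
    """
    scores = np.asarray(scores, dtype=float)
    null_sorted = np.sort(np.asarray(null_scores, dtype=float).ravel())
    n_at_least = null_sorted.size - np.searchsorted(null_sorted, scores, side="left")
    return (n_at_least + 1) / (null_sorted.size + 1)
```

Genes whose empirical p-value falls below a chosen cutoff (the abstract uses 0.001, 0.005, and 0.01) would then be kept as marker candidates, for example with `np.flatnonzero(p_values < 0.01)`.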
Pages: 11
Related papers
50 in total
  • [21] Identification of metastasis-related genes for predicting prostate cancer diagnosis, metastasis and immunotherapy drug candidates using machine learning approaches
    Wang, YaXuan
    Ji, Bo
    Zhang, Lu
    Wang, Jinfeng
    He, JiaXin
    Ding, BeiChen
    Ren, MingHua
    BIOLOGY DIRECT, 2024, 19 (01)
  • [22] Gene expression profiles and molecular markers to predict distant metastasis of early stage breast cancers.
    Wang, Y
    Atkins, D
    Zhang, Y
    Yang, F
    Jatkoe, T
    Talantov, D
    Sieuwerts, A
    Timmermans, M
    Berns, E
    Klijn, J
    Foekens, J
    BREAST CANCER RESEARCH AND TREATMENT, 2003, 82 : S120 - S120
  • [23] Breast Cancer Classification: Features Investigation Using Machine Learning Approaches
    Mashudi, Nurul Amirah
    Rossli, Syaidathul Amaleena
    Ahmad, Norulhusna
    Noor, Norliza Mohd
    INTERNATIONAL JOURNAL OF INTEGRATED ENGINEERING, 2021, 13 (05) : 107 - 118
  • [24] Analysis of gene expression profiles of lung cancer subtypes with machine learning algorithms
    Yuan, Fei
    Lu, Lin
    Zou, Quan
    BIOCHIMICA ET BIOPHYSICA ACTA-MOLECULAR BASIS OF DISEASE, 2020, 1866 (08):
  • [25] Integrative machine learning analysis of multiple gene expression profiles in cervical cancer
    Tan, Mei Sze
    Chang, Siow-Wee
    Cheah, Phaik Leng
    Yap, Hwa Jen
    PEERJ, 2018, 6
  • [26] Patterns of Gene Expression Profiles Associated with Colorectal Cancer in Colorectal Mucosa by Using Machine Learning Methods
    Ren, Jing Xin
    Chen, Lei
    Guo, Wei
    Feng, Kai Yan
    Cai, Yu-Dong
    Huang, Tao
    COMBINATORIAL CHEMISTRY & HIGH THROUGHPUT SCREENING, 2024, 27 (19) : 2921 - 2934
  • [27] Breast cancer prediction based on gene expression data using interpretable machine learning techniques
    Kallah-Dagadu, Gabriel
    Mohammed, Mohanad
    Nasejje, Justine B.
    Mchunu, Nobuhle Nokubonga
    Twabi, Halima S.
    Batidzirai, Jesca Mercy
    Singini, Geoffrey Chiyuzga
    Nevhungoni, Portia
    Maposa, Innocent
    SCIENTIFIC REPORTS, 2025, 15 (01):
  • [28] Identification of gene expression profiles that predict the aggressive behavior of breast cancer cells
    Zajchowski, DA
    Bartholdi, MF
    Gong, Y
    Webster, L
    Liu, HL
    Munishkin, A
    Beauheim, C
    Harvey, S
    Ethier, SP
    Johnson, PH
    CANCER RESEARCH, 2001, 61 (13) : 5168 - 5178
  • [29] A Survey of Machine Learning Approaches Applied to Gene Expression Analysis for Cancer Prediction
    Khalsan, Mahmood
    Machado, Lee R.
    Al-Shamery, Eman Salih
    Ajit, Suraj
    Anthony, Karen
    Mu, Mu
    Agyeman, Michael Opoku
    IEEE ACCESS, 2022, 10 : 27522 - 27534
  • [30] Identification of potential prognostic markers associated with lung metastasis in breast cancer by weighted gene co-expression network analysis
    Zhang, Xixun
    CANCER BIOMARKERS, 2022, 33 (03) : 299 - 310