Identification of Breast Cancer Metastasis Markers from Gene Expression Profiles Using Machine Learning Approaches

Authors
Jung, Jinmyung [1]
Yoo, Sunyong [2]
Affiliations
[1] Univ Suwon, Coll Informat & Commun Technol, Div Data Sci, Hwaseong 18323, South Korea
[2] Chonnam Natl Univ, Dept ICT Convergence Syst Engn, Gwangju 61005, South Korea
Funding
National Research Foundation of Singapore;
Keywords
metastasis marker; gene expression; machine learning; XGBoost; breast cancer; feature importance; PROTEIN; REGULATOR; RESOURCE;
DOI
10.3390/genes14091820
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Cancer metastasis accounts for approximately 90% of cancer deaths, and identifying markers of metastasis is the first step toward its prevention. To characterize metastasis marker genes (MGs) of breast cancer, XGBoost models that classify metastasis status were trained on gene expression profiles from TCGA. A metastasis score (MS) was then assigned to each gene by calculating the inner product between the feature importance and the AUC performance of the models. As a result, the 54, 202, and 357 genes with the highest MS were characterized as MGs at empirical p-value cutoffs of 0.001, 0.005, and 0.01, respectively. The three sets of MGs were compared with those from existing metastasis marker databases, which yielded significant results in most comparisons (p-value < 0.05). They were also significantly enriched in biological processes associated with breast cancer metastasis. Three of the MGs, SPPL2C, KRT23, and RGS7, showed highly significant results (p-value < 0.01) in the survival analysis. MGs that could not be identified by statistical analysis (e.g., GOLM1, ELAVL1, UBP1, and AZGP1), as well as the MGs with the highest MS (e.g., ZNF676, FAM163B, LDOC2, IRF1, and STK40), were verified via the literature. Additionally, we examined how close the MGs were to each other in protein-protein interaction networks. We expect that the characterized markers will help in understanding and preventing breast cancer metastasis.
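The scoring step described in the abstract can be illustrated with a minimal sketch. It assumes that each trained model contributes one vector of per-gene feature importances and one AUC value, so that a gene's MS is the inner product of its importances across models with the models' AUCs. The function names, the placeholder gene labels, the toy numbers, and the null distribution used for the empirical p-value are all hypothetical; the abstract does not state how the null scores were generated, so this is one plausible reading and not the authors' implementation.

```python
from typing import Dict, List, Sequence


def metastasis_scores(
    importances: Sequence[Sequence[float]],
    aucs: Sequence[float],
    genes: Sequence[str],
) -> Dict[str, float]:
    """Score each gene as the inner product of its feature importances
    across models with the AUC values of those models."""
    if len(importances) != len(aucs):
        raise ValueError("need exactly one AUC per model")
    scores: Dict[str, float] = {}
    for j, gene in enumerate(genes):
        scores[gene] = sum(row[j] * auc for row, auc in zip(importances, aucs))
    return scores


def empirical_p_value(score: float, null_scores: Sequence[float]) -> float:
    """Fraction of null scores at least as large as the observed score,
    with a +1 correction so the estimate is never exactly zero."""
    exceed = sum(1 for s in null_scores if s >= score)
    return (exceed + 1) / (len(null_scores) + 1)


def select_markers(
    scores: Dict[str, float],
    null_scores: Sequence[float],
    cutoff: float,
) -> List[str]:
    """Return genes whose empirical p-value is at or below the cutoff,
    ordered from highest to lowest score."""
    kept = [g for g, s in scores.items()
            if empirical_p_value(s, null_scores) <= cutoff]
    return sorted(kept, key=lambda g: scores[g], reverse=True)


# Toy input (hypothetical): three models (rows) and three placeholder genes.
genes = ["GENE_A", "GENE_B", "GENE_C"]
importances = [
    [0.5, 0.3, 0.2],
    [0.6, 0.1, 0.3],
    [0.4, 0.4, 0.2],
]
aucs = [0.7, 0.8, 0.6]

# Hypothetical null scores, e.g. from models trained on permuted labels.
null_scores = [0.1 * k for k in range(1, 10)]

scores = metastasis_scores(importances, aucs, genes)
markers = select_markers(scores, null_scores, cutoff=0.1)
```

Weighting importances by AUC means that a gene ranked highly by a poorly performing model contributes less to its final score than one ranked highly by an accurate model. In practice the cutoffs quoted in the abstract (0.001, 0.005, 0.01) would need a much larger null sample than this toy example provides.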
Pages: 11