DNA methylation loci identification for pan-cancer early-stage diagnosis and prognosis using a new distributed parallel partial least squares method

Cited by: 0
Authors
He, Qi-en [1]
Zhu, Jun-xuan [1]
Wang, Li-yan [1]
Ding, En-ci [2]
Song, Kai [1]
Affiliations
[1] Tianjin Univ, Sch Chem Engn & Technol, Tianjin, Peoples R China
[2] Tianjin First Cent Hosp, Tianjin, Peoples R China
Keywords
DNA methylation; partial least squares; MapReduce; pan-cancer analysis; early-stage tumor diagnosis and prognosis; survival analysis; expression; tool
DOI
10.3389/fgene.2022.940214
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
Aberrant methylation is an early detectable event in many tumors, which makes it promising for pan-cancer early-stage diagnosis and prognosis. To analyze large pan-cancer methylation data efficiently and to overcome the co-methylation phenomenon, a MapReduce-based distributed and parallel partial least squares approach was proposed. The large-scale, high-dimensional methylation data were first decomposed into distributed blocks according to their genome locations. A distributed and parallel data processing strategy was then built on the MapReduce framework, and latent variables were extracted for each distributed block. A set of pan-cancer signatures was further identified from the gene expression profiles of the selected loci, using a differential co-expression network followed by statistical tests. In total, 15 TCGA datasets were used as training data and 3 GEO datasets as testing data to verify the method. As a result, 22,000 potential methylation loci were selected as highly related to early-stage pan-cancer diagnosis. Of these, 67 methylation loci were further identified as pan-cancer signatures once their gene expression was also considered. Survival analysis and pathway enrichment analysis of these loci suggest that they may serve as potential drug targets and that the proposed method may serve as a uniform framework for signature identification with big data.
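The block-wise idea in the abstract (split loci by genome location, extract latent variables per block, then combine) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes blocks are defined per chromosome, extracts only the first PLS1 latent variable per block, and runs the map and reduce steps in a single process with Python's built-in `map` and `functools.reduce` rather than on a real MapReduce cluster. All function names and the `chrom_of` lookup are hypothetical.

```python
from collections import defaultdict
from functools import reduce
from math import sqrt


def split_by_chromosome(loci, chrom_of):
    """Group column indices of the methylation matrix by chromosome (assumed block rule)."""
    blocks = defaultdict(list)
    for j, locus in enumerate(loci):
        blocks[chrom_of[locus]].append(j)
    return dict(blocks)


def center(values):
    """Subtract the mean from a list of numbers."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def first_pls_component(X, y, cols):
    """Map step: first PLS1 weight vector and score vector for one block of columns.

    The weight vector is X_block^T y (both centered), scaled to unit length;
    the scores are the centered block projected onto that weight vector.
    """
    yc = center(y)
    block = [center([row[j] for row in X]) for j in cols]  # one list per column
    w = [sum(x * t for x, t in zip(col, yc)) for col in block]
    norm = sqrt(sum(v * v for v in w)) or 1.0  # guard against an all-zero block
    w = [v / norm for v in w]
    scores = [sum(w[k] * block[k][i] for k in range(len(cols)))
              for i in range(len(y))]
    return w, scores


def run_blockwise_pls(X, y, loci, chrom_of):
    """Map each block to its latent variable, then reduce into one dictionary."""
    blocks = split_by_chromosome(loci, chrom_of)
    mapped = map(lambda kv: (kv[0], first_pls_component(X, y, kv[1])),
                 blocks.items())

    def merge(acc, pair):
        acc[pair[0]] = pair[1]
        return acc

    return reduce(merge, mapped, {})
```

Because each block is processed independently in the map step, the same function could in principle be dispatched to separate workers; the reduce step only collects the per-block weights and scores.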
Pages: 13
Related papers
2 in total
  • [1] Non-invasive diagnosis of early-stage lung cancer using high-throughput targeted DNA methylation sequencing of circulating tumor DNA (ctDNA)
    Liang, Wenhua
    Zhao, Yue
    Huang, Weizhe
    Gao, Yangbin
    Xu, Weihong
    Tao, Jinsheng
    Yang, Meng
    Li, Lequn
    Ping, Wei
    Shen, Hui
    Fu, Xiangning
    Chen, Zhiwei
    Laird, Peter W.
    Cai, Xuyu
    Fan, Jian-Bing
    He, Jianxing
    [J]. Theranostics, 2019, 9(7): 2056-2070
  • [2] A New Method for Preliminary Identification of Gene Regulatory Networks from Gene Microarray Cancer Data Using Ridge Partial Least Squares With Recursive Feature Elimination and Novel Brier and Occurrence Probability Measures
    Chan, S. C.
    Wu, H. C.
    Tsui, K. M.
    [J]. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 2012, 42(6): 1514-1528