pETM: a penalized Exponential Tilt Model for analysis of correlated high-dimensional DNA methylation data

Cited by: 13
Authors
Sun, Hokeun [1 ]
Wang, Ya [2 ]
Chen, Yong [3 ]
Li, Yun [4 ,5 ,6 ]
Wang, Shuang [2 ]
Affiliations
[1] Pusan Natl Univ, Dept Stat, Busan 609735, South Korea
[2] Columbia Univ, Mailman Sch Publ Hlth, Dept Biostat, New York, NY 10032 USA
[3] Univ Penn, Perelman Sch Med, Div Biostat, Philadelphia, PA 19103 USA
[4] Univ N Carolina, Dept Biostat, Chapel Hill, NC 27599 USA
[5] Univ N Carolina, Dept Genet, Chapel Hill, NC 27599 USA
[6] Univ N Carolina, Dept Comp Sci, Chapel Hill, NC 27599 USA
Funding
National Research Foundation, Singapore;
Keywords
OVARIAN-CANCER; REGULARIZATION PATHS; LUNG-CANCER; CELL; GENES; EXPRESSION; IDENTIFICATION; REGRESSION; MARKERS; HYPERMETHYLATION;
DOI
10.1093/bioinformatics/btx064
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
Motivation: DNA methylation plays an important role in many biological processes and in cancer progression. Recent studies have found that groups can differ in methylation variability as well as in mean methylation level. Several methods have therefore been developed that consider both mean and variance signals to improve the statistical power for detecting differentially methylated loci. Moreover, because methylation levels of neighboring CpG sites are known to be strongly correlated, methods that incorporate these correlations have also been developed. We previously developed a network-based penalized logistic regression for correlated methylation data, but it focuses only on mean signals. We also developed a generalized exponential tilt model that captures both mean and variance signals, but it examines only one CpG site at a time.
Results: In this article, we propose a penalized Exponential Tilt Model (pETM) using network-based regularization that captures both mean and variance signals in DNA methylation data and takes into account the correlations among nearby CpG sites. By combining the strengths of the two models we previously developed, we demonstrate the superior power and better performance of the pETM method through simulations and applications to the 450K DNA methylation array data of the four breast invasive carcinoma subtypes from The Cancer Genome Atlas (TCGA) project. The pETM method identifies many cancer-related methylation loci that were missed by our previously developed method, which considers correlations among nearby methylation loci but not variance signals.
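
The idea of capturing both mean and variance signals at a single CpG site can be illustrated with a small simulation. The sketch below is not the authors' pETM implementation. It assumes the common exponential tilt form in which the log ratio of the case density to the control density is quadratic in the methylation value, which is equivalent to a logistic regression on x and x². That functional form, the simulated Gaussian data, and the helper name `fit_logistic` are all assumptions introduced here for illustration. The network-based penalty across correlated sites is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated methylation-like values at ONE hypothetical CpG site (not real data):
# both groups share the same mean, but cases are more variable than controls.
n = 2000
controls = rng.normal(loc=0.0, scale=1.0, size=n)
cases = rng.normal(loc=0.0, scale=2.0, size=n)

x = np.concatenate([controls, cases])
y = np.concatenate([np.zeros(n), np.ones(n)])

# Design matrix: intercept, x (mean signal), x^2 (variance signal).
X = np.column_stack([np.ones_like(x), x, x**2])


def fit_logistic(X, y, n_iter=100, tol=1e-10):
    """Unpenalized logistic regression fitted by Newton-Raphson (illustrative helper)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = np.clip(X @ beta, -30.0, 30.0)   # guard against overflow in exp
        p = 1.0 / (1.0 + np.exp(-eta))         # fitted case probabilities
        w = p * (1.0 - p)                      # Newton weights
        grad = X.T @ (y - p)                   # gradient of the log-likelihood
        hess = (X * w[:, None]).T @ X          # negative Hessian
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


beta = fit_logistic(X, y)
print("intercept, linear, quadratic:", np.round(beta, 3))
```

In this simulated setting the true log density ratio is -ln 2 + 0.375 x², so the linear coefficient should be close to zero while the quadratic coefficient carries the signal. A model with only the linear term, which tests mean differences, would have little to detect here.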
Pages: 1765 - 1772
Number of pages: 8
Related papers
50 in total
  • [31] High-dimensional inference for linear model with correlated errors
    Panxu Yuan
    Xiao Guo
    Metrika, 2022, 85 : 21 - 52
  • [32] Penalized Bayesian forward continuation ratio model with application to high-dimensional data with discrete survival outcomes
    Seffernick, Anna Eames
    Archer, Kellie J.
    PLOS ONE, 2024, 19 (03):
  • [33] Integrated analysis of DNA-methylation and gene expression using high-dimensional penalized regression: a cohort study on bone mineral density in postmenopausal women
    Lien, Tonje G.
    Borgan, Ornulf
    Reppe, Sjur
    Gautvik, Kaare
    Glad, Ingrid Kristine
    BMC MEDICAL GENOMICS, 2018, 11
  • [34] Integrated analysis of DNA-methylation and gene expression using high-dimensional penalized regression: a cohort study on bone mineral density in postmenopausal women
    Tonje G. Lien
    Ørnulf Borgan
    Sjur Reppe
    Kaare Gautvik
    Ingrid Kristine Glad
    BMC Medical Genomics, 11
  • [35] Supervised Classification of High-Dimensional Correlated Data: Application to Genomic Data
    Gaye, Aboubacry
    Diongue, Abdou Ka
    Sylla, Seydou Nourou
    Diarra, Maryam
    Diallo, Amadou
    Talla, Cheikh
    Loucoubar, Cheikh
    JOURNAL OF CLASSIFICATION, 2024, 41 (01) : 158 - 169
  • [36] Supervised Classification of High-Dimensional Correlated Data: Application to Genomic Data
    Aboubacry Gaye
    Abdou Ka Diongue
    Seydou Nourou Sylla
    Maryam Diarra
    Amadou Diallo
    Cheikh Talla
    Cheikh Loucoubar
    Journal of Classification, 2024, 41 : 158 - 169
  • [37] Two-group classification with high-dimensional correlated data: A factor model approach
    Pedro Duarte Silva, A.
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2011, 55 (11) : 2975 - 2990
  • [38] Model Selection for High-Dimensional Data
    Owrang, Arash
    Jansson, Magnus
    2016 50TH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS, 2016, : 606 - 609
  • [39] Model averaging in calibration of near-infrared instruments with correlated high-dimensional data
    Salaki, Deiby Tineke
    Kurnia, Anang
    Sartono, Bagus
    Mangku, I. Wayan
    Gusnanto, Arief
    JOURNAL OF APPLIED STATISTICS, 2024, 51 (02) : 279 - 297
  • [40] Targeted Inference Involving High-Dimensional Data Using Nuisance Penalized Regression
    Sun, Qiang
    Zhang, Heping
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2021, 116 (535) : 1472 - 1486