Penalized logistic regression based on L1/2 penalty for high-dimensional DNA methylation data

Cited by: 4
Authors
Jiang, Hong-Kun [1]
Liang, Yong [1]
Affiliations
[1] Macau Univ Sci & Technol, Fac Informat Technol, Ave Wai Long, Taipa 999078, Macau, Peoples R China
Keywords
DNA methylation; CpG island; L1/2 regularization method; gene regulatory network; variable selection; BREAST-CANCER; VARIABLE SELECTION; REGULARIZATION; PROLIFERATION; POLYMORPHISM; TARGET; GENES; MPO
DOI
10.3233/THC-209016
Chinese Library Classification (CLC) number
R19 [Health care organizations and services (health services administration)]
Abstract
BACKGROUND: DNA methylation is a molecular modification of DNA that plays a vital role in regulating gene expression. In cancer tissues, 5'-C-phosphate-G-3' (CpG) rich regions are abnormally hypermethylated or hypomethylated, so methods that identify disease-associated CpG sites are useful. CpG sites within the same gene or the same CpG island are highly correlated with each other. OBJECTIVE: Based on this group effect, we propose an efficient and accurate method for selecting pathogenic CpG sites. METHODS: Our method combines an L1/2 regularized solver with a central node fully connected network to penalize a group-constrained logistic regression model. As a result, both sparsity and a group effect are imposed on the correlated regression coefficients. RESULTS: Extensive simulation studies compared the proposed approach with existing mainstream regularization methods in terms of classification accuracy and stability. In the simulations, the proposed approach attained greater predictive accuracy than previous methods. The method was then applied to over 20000 CpG sites and verified on ovarian cancer data generated with the Illumina Infinium HumanMethylation 27K Beadchip. On this real dataset, the indicators of predictive accuracy were higher than those of previous methods, and more of the genes containing the selected CpG sites were confirmed to be pathogenic. In addition, the method selected fewer CpG sites in total than the other methods while achieving higher accuracy rates on both the simulated and the DNA methylation data. CONCLUSION: The proposed method offers researchers in DNA methylation an advanced and powerful tool for recognizing pathogenic CpG sites.
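The METHODS statement pairs a sparsity penalty with a network penalty on a logistic regression model, but the abstract does not state the objective function. The following is therefore only a minimal sketch under stated assumptions, not the authors' implementation: it assumes the objective is the mean logistic loss plus lam1 times the sum of |beta_j|^(1/2) plus lam2 times the quadratic form beta' L beta, where L is a graph Laplacian. A plain fully connected graph within each gene or CpG island stands in for the "central node fully connected network", whose exact structure the abstract does not give. The names group_laplacian, penalized_objective, lam1 and lam2 are hypothetical.

```python
import math


def group_laplacian(groups):
    """Laplacian of a graph that links every pair of sites sharing a group label.

    Assumption: a plain fully connected within-group graph, used here as a
    stand-in for the network described in the abstract.
    """
    n = len(groups)
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and groups[i] == groups[j]:
                lap[i][j] = -1.0   # edge between two sites in the same group
                lap[i][i] += 1.0   # the diagonal accumulates the node degree
    return lap


def log1pexp(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def penalized_objective(beta, X, y, lap, lam1, lam2):
    """Mean logistic loss + L1/2 sparsity penalty + Laplacian group penalty.

    Assumed form only; labels y are coded 0/1 and there is no intercept.
    """
    nll = 0.0
    for row, label in zip(X, y):
        z = sum(b * x for b, x in zip(beta, row))
        nll += log1pexp(z) - label * z
    nll /= len(y)
    # L1/2 penalty: sum of square roots of the absolute coefficients.
    sparsity = sum(math.sqrt(abs(b)) for b in beta)
    # Quadratic form beta' L beta: sum of squared differences over graph edges.
    p = len(beta)
    smooth = sum(beta[i] * lap[i][j] * beta[j] for i in range(p) for j in range(p))
    return nll + lam1 * sparsity + lam2 * smooth


# Toy data: three CpG sites, the first two assumed to lie in the same gene.
lap = group_laplacian([0, 0, 1])
X = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]
y = [1.0, 0.0]
print(penalized_objective([1.0, -1.0, 0.0], X, y, lap, lam1=0.1, lam2=0.25))
```

In this sketch the Laplacian term penalizes squared differences between coefficients of sites in the same group, which is one common way to encode a group effect, while the square-root term drives individual coefficients toward zero. An actual solver would minimize this objective, for example by coordinate descent with a thresholding operator suited to the L1/2 penalty; that step is not shown here.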
Pages: S161-S171
Page count: 11