Regression-Based Network Estimation for High-Dimensional Genetic Data

Cited by: 2
Authors
Lee, Kyu Min [1]
Lee, Minhyeok [2]
Seok, Junhee [2]
Han, Sung Won [1]
Affiliations
[1] Korea Univ, Sch Ind Management Engn, 145 Anam Ro, Seoul 02841, South Korea
[2] Korea Univ, Sch Elect Engn, 145 Anam Ro, Seoul 02841, South Korea
Funding
National Research Foundation, Singapore
Keywords
adaptive elastic-net; gene network estimation; graphical model; regression-based approach; variable selection; adaptive LASSO; health care; graphs
DOI
10.1089/cmb.2018.0225
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Advances in genome sequencing technology have made large volumes of gene expression data easy to obtain. The resulting growth in genetic information, however, calls for new approaches to network estimation. As data dimensionality increases, multicollinearity arises and makes gene networks difficult to estimate. The same problem occurs when hub nodes exist, and gene networks are known to contain regulator genes that can be interpreted as hub nodes. This study aims to develop methods that perform well on high-dimensional data with hub nodes. We propose regression-based approaches as feasible solutions. Elastic-net and adaptive elastic-net penalized regressions were applied to compensate for the disadvantages of existing regression-based approaches that employ LASSO or adaptive LASSO. Experiments were performed to compare the proposed regression-based approaches with other conventional methods. We confirmed the superior performance of the regression-based approaches and applied them to actual genetic data to verify their suitability for estimating gene networks. The results demonstrate the robustness of the proposed methods on high-dimensional gene expression data.
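The regression-based idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a node-wise (neighborhood-selection) scheme in which each gene is regressed on all others with an elastic-net penalty, and an edge is kept when either regression selects the other gene (an assumed "or" rule). The penalty parametrization (`lam`, `alpha`), the coordinate-descent solver, the function names, and the toy data with one hub gene are all hypothetical choices made for illustration. The adaptive elastic-net variant is not covered.

```python
import random


def soft_threshold(value, threshold):
    """Soft-thresholding operator; it yields exact zeros (the L1 part)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def standardize(data):
    """Center each column and scale it to unit variance."""
    n, p = len(data), len(data[0])
    columns = []
    for k in range(p):
        col = [row[k] for row in data]
        mean = sum(col) / n
        sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5 or 1.0
        columns.append([(v - mean) / sd for v in col])
    return [[columns[k][i] for k in range(p)] for i in range(n)]


def elastic_net(X, y, lam, alpha, n_iter=200, tol=1e-8):
    """Coordinate descent for the (assumed) objective
    (1/2n)||y - Xb||^2 + lam*alpha*||b||_1 + (lam*(1-alpha)/2)*||b||_2^2."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)
    col_sq = [sum(X[i][k] ** 2 for i in range(n)) / n for k in range(p)]
    for _ in range(n_iter):
        max_change = 0.0
        for k in range(p):
            if col_sq[k] == 0.0:
                continue  # constant column carries no information
            rho = sum(X[i][k] * residual[i] for i in range(n)) / n
            rho += col_sq[k] * beta[k]
            new = soft_threshold(rho, lam * alpha)
            new /= col_sq[k] + lam * (1.0 - alpha)
            delta = new - beta[k]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][k] * delta
                beta[k] = new
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta


def estimate_network(data, lam=0.3, alpha=0.5, rule="or"):
    """Regress every gene on all others; nonzero coefficients become edges."""
    Z = standardize(data)
    n, p = len(Z), len(Z[0])
    selected = [[False] * p for _ in range(p)]
    for j in range(p):
        others = [k for k in range(p) if k != j]
        X = [[Z[i][k] for k in others] for i in range(n)]
        y = [Z[i][j] for i in range(n)]
        for k, b in zip(others, elastic_net(X, y, lam, alpha)):
            selected[j][k] = b != 0.0
    edges = set()
    for j in range(p):
        for k in range(j + 1, p):
            if rule == "or":
                keep = selected[j][k] or selected[k][j]
            else:
                keep = selected[j][k] and selected[k][j]
            if keep:
                edges.add((j, k))
    return edges


# Toy data (hypothetical): gene 0 is a hub driving genes 1-3; gene 4 is noise.
random.seed(0)
data = []
for _ in range(200):
    hub = random.gauss(0, 1)
    row = [hub] + [0.8 * hub + 0.3 * random.gauss(0, 1) for _ in range(3)]
    row.append(random.gauss(0, 1))
    data.append(row)

edges = estimate_network(data, lam=0.3, alpha=0.5)
print(sorted(edges))
```

The ridge part of the penalty is what distinguishes this from a LASSO-based sketch: genes 1 to 3 are strongly correlated through the hub, and the quadratic term keeps the coordinate updates stable and tends to share weight among correlated predictors rather than picking one arbitrarily.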
Pages: 336-349
Number of pages: 14