Seagull: lasso, group lasso and sparse-group lasso regularization for linear regression models via proximal gradient descent

Cited by: 17
Authors
Klosa, Jan [1 ]
Simon, Noah [2 ]
Westermark, Pal Olof [1 ]
Liebscher, Volkmar [3 ]
Wittenburg, Doerte [1 ]
Affiliations
[1] Leibniz Inst Farm Anim Biol, Inst Genet & Biometry, D-18196 Dummerstorf, Germany
[2] Univ Washington, Dept Biostat, Seattle, WA 98195 USA
[3] Univ Greifswald, Inst Math & Comp Sci, D-17489 Greifswald, Germany
Keywords
Optimization; Machine learning; High-dimensional data; R package; Selection
DOI
10.1186/s12859-020-03725-w
Chinese Library Classification
Q5 [Biochemistry]
Discipline codes
071010; 081704
Abstract
Background: Statistical analyses of biological problems in the life sciences often lead to high-dimensional linear models. To solve the corresponding systems of equations, penalization approaches are often the methods of choice. They are especially useful in the case of multicollinearity, which appears if the number of explanatory variables exceeds the number of observations or for some biological reason. The model goodness of fit is then penalized by some suitable function of interest. Prominent examples are the lasso, group lasso and sparse-group lasso. Here, we offer a fast and numerically cheap implementation of these operators via proximal gradient descent. The grid search for the penalty parameter is realized by warm starts. The step size between consecutive iterations is determined with backtracking line search. Finally, seagull, the R package presented here, produces complete regularization paths.
Results: Publicly available high-dimensional methylation data are used to compare seagull to the established R package SGL. The results of both packages enabled a precise prediction of biological age from DNA methylation status. But even though the results of seagull and SGL were very similar (R^2 > 0.99), seagull computed the solution in a fraction of the time needed by SGL. Additionally, seagull enables the incorporation of weights for each penalized feature.
Conclusions: The following operators for linear regression models are available in seagull: lasso, group lasso, sparse-group lasso and Integrative LASSO with Penalty Factors (IPF-lasso). Thus, seagull is a convenient envelope of lasso variants.
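The three ingredients the abstract names (proximal gradient descent, backtracking line search, and warm starts along the penalty grid) can be sketched for the plain lasso as follows. This is a minimal Python illustration of the generic technique, not seagull's actual R implementation; the function names, defaults, and stopping rule are hypothetical:

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of the l1 penalty: elementwise soft-thresholding."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_proximal_gradient(X, y, lam, b0=None, step=1.0,
                            shrink=0.5, max_iter=500, tol=1e-8):
    """Minimize 0.5*||y - X b||^2 + lam*||b||_1 by proximal gradient
    descent, with the step size chosen by backtracking line search."""
    p = X.shape[1]
    b = np.zeros(p) if b0 is None else b0.copy()
    loss = lambda b: 0.5 * np.sum((y - X @ b) ** 2)
    for _ in range(max_iter):
        grad = X.T @ (X @ b - y)
        t = step
        while True:
            b_new = soft_threshold(b - t * grad, t * lam)
            d = b_new - b
            # Backtracking: shrink t until the quadratic upper bound on
            # the smooth part of the objective holds at the new point.
            if loss(b_new) <= loss(b) + grad @ d + d @ d / (2 * t):
                break
            t *= shrink
        converged = np.max(np.abs(b_new - b)) < tol
        b = b_new
        if converged:
            break
    return b

def lasso_path(X, y, lambdas):
    """Warm starts: traverse a decreasing penalty grid, reusing each
    solution as the starting point for the next penalty value."""
    b = np.zeros(X.shape[1])
    path = []
    for lam in sorted(lambdas, reverse=True):
        b = lasso_proximal_gradient(X, y, lam, b0=b)
        path.append(b.copy())
    return path
```

For the group lasso and sparse-group lasso, only the proximal operator changes (blockwise rather than elementwise shrinkage); the surrounding descent, line-search, and warm-start loop stays the same, which is what makes this family of operators cheap to implement in one framework.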
Pages: 8