Benefits of dimension reduction in penalized regression methods for high-dimensional grouped data: a case study in low sample size

Cited by: 13
Authors
Ajana, Soufiane [1]
Acar, Niyazi [2]
Bretillon, Lionel [2]
Hejblum, Boris P. [3, 4]
Jacqmin-Gadda, Helene [5]
Delcourt, Cecile [1]
Berdeaux, Olivier [2]
Bouton, Sylvain [6]
Bron, Alain [2, 7]
Buaud, Benjamin [8]
Cabaret, Stephanie [2]
Cougnard-Gregorie, Audrey [1]
Creuzot-Garcher, Catherine [2, 7]
Delyfer, Marie-Noelle [1, 9]
Feart-Couret, Catherine [1]
Febvret, Valerie [2]
Gregoire, Stephane [2]
He, Zhiguo [10]
Korobelnik, Jean-Francois [1, 9]
Martine, Lucy [2]
Merle, Benedicte [1]
Vaysse, Carole [8]
Affiliations
[1] Univ Bordeaux, INSERM, Bordeaux Populat Hlth Res Ctr, Team LEHA,UMR 1219, F-33000 Bordeaux, France
[2] Univ Bourgogne Franche Comte, AgroSup Dijon, Ctr Sci Gout & Alimentat, CNRS,INRA, Dijon, France
[3] Univ Bordeaux, INSERM, Bordeaux Populat Hlth Res Ctr 1219, ISPED,Inria SISTM, F-33000 Bordeaux, France
[4] Hop Henri Mondor, VRI, Creteil, France
[5] Univ Bordeaux, Team Biostat, UMR 1219, INSERM,Bordeaux Populat Hlth Res Ctr, F-33000 Bordeaux, France
[6] Lab Thea, Clermont Ferrand, France
[7] Univ Hosp, Dept Ophthalmol, Dijon, France
[8] Equipe Nutr Metab & Sante, ITERG, Bordeaux, France
[9] CHU Bordeaux, Serv Ophtalmol, F-33000 Bordeaux, France
[10] Univ Jean Monnet, Fac Med, EA2521, Lab Biol Imaging & Engn Corneal Grafts, St Etienne, France
Keywords
LEAST-SQUARES REGRESSION; VARIABLE SELECTION; CROSS-VALIDATION; FATTY-ACID; REGULARIZATION; IDENTIFICATION; RETINA; ERROR; GENE; PART;
DOI
10.1093/bioinformatics/btz135
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Motivation: In some prediction analyses, predictors have a natural grouping structure and selecting predictors accounting for this additional information could be more effective for predicting the outcome accurately. Moreover, in a high dimension low sample size framework, obtaining a good predictive model becomes very challenging. The objective of this work was to investigate the benefits of dimension reduction in penalized regression methods, in terms of prediction performance and variable selection consistency, in high dimension low sample size data. Using two real datasets, we compared the performances of lasso, elastic net, group lasso, sparse group lasso, sparse partial least squares (PLS), group PLS and sparse group PLS.
Results: Considering dimension reduction in penalized regression methods improved the prediction accuracy. The sparse group PLS reached the lowest prediction error while consistently selecting a few predictors from a single group.
Pages: 3628-3634
Page count: 7