Weighted Lasso with Data Integration

Cited by: 34
Authors
Bergersen, Linn Cecilie [1]
Glad, Ingrid K. [1]
Lyng, Heidi [2]
Affiliations
[1] Univ Oslo, N-0316 Oslo, Norway
[2] Norwegian Radium Hosp, Tromso, Norway
Keywords
adaptive lasso; cervix cancer; copy number alterations; data integration; gene expressions; head and neck cancer; Lasso; p >> n; penalized regression; prediction; variable selection; weighted lasso; GENOME-WIDE ASSOCIATION; VARIABLE SELECTION; ADAPTIVE LASSO; DANTZIG SELECTOR; REGRESSION; METASTASIS; SHRINKAGE; NETWORK; CANCER; MODEL;
DOI
10.2202/1544-6115.1703
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
The lasso is one of the most commonly used methods for high-dimensional regression, but it can be unstable and lacks satisfactory asymptotic properties for variable selection. We propose to use the weighted lasso with integrated relevant external information on the covariates to guide the selection towards more stable results. Weighting the penalties with external information gives each regression coefficient a covariate-specific amount of penalization and can improve upon standard methods that do not use such information by borrowing knowledge from the external material. The method is applied to two cancer data sets, with gene expressions as covariates. We find interesting gene signatures, which we are able to validate. We discuss various ideas on how the weights should be defined and illustrate how different types of investigations can utilize our method by exploiting different sources of external data. Through simulations, we show that our method outperforms the lasso and the adaptive lasso, in terms of both variable selection and prediction, when the external information ranges from relevant to partly relevant.
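The idea of a covariate-specific penalty described in the abstract can be illustrated with a minimal sketch. It assumes the common weighted lasso objective (1/2n)·||y − Xβ||² + λ·Σ_j w_j·|β_j| and solves it by plain coordinate descent. The function names, the toy data, and the weight values are hypothetical and chosen only for illustration; the record does not state how the authors define their weights or which solver they use.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z towards zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def weighted_lasso(X, y, weights, lam, n_iter=200):
    """Coordinate descent for (1/2n)*||y - X b||^2 + lam * sum_j weights[j]*|b_j|.

    X is a list of rows, y a list of responses, weights a list of
    non-negative penalty weights (hypothetical stand-ins for weights
    derived from external data). No intercept is fitted.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X beta for beta = 0
    for _ in range(n_iter):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the partial residual (beta_j removed).
            rho = sum(col[i] * (residual[i] + col[i] * beta[j]) for i in range(n)) / n
            # The threshold is scaled by this covariate's own weight.
            new_bj = soft_threshold(rho, lam * weights[j]) / z
            delta = new_bj - beta[j]
            if delta != 0.0:
                residual = [residual[i] - col[i] * delta for i in range(n)]
                beta[j] = new_bj
    return beta


if __name__ == "__main__":
    # Toy design with two orthogonal covariates that explain y equally well.
    X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
    y = [1.0, 1.0, 1.0, 1.0]
    # Equal weights reduce to the standard lasso.
    print(weighted_lasso(X, y, weights=[1.0, 1.0], lam=0.2))
    # A larger weight on the second covariate penalizes it more heavily.
    print(weighted_lasso(X, y, weights=[1.0, 5.0], lam=0.2))
```

With equal weights the two covariates are treated identically, while a larger weight on one of them raises its threshold and can remove it from the model. This is the mechanism by which external information could steer the selection, under the assumptions stated above.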
Pages: 30
Related papers
50 in total
  • [1] A Data-Dependent Weighted LASSO Under Poisson Noise
    Hunt, Xin Jiang
    Reynaud-Bouret, Patricia
    Rivoirard, Vincent
    Sansonnet, Laure
    Willett, Rebecca
    [J]. IEEE TRANSACTIONS ON INFORMATION THEORY, 2019, 65 (03) : 1589 - 1613
  • [2] Tailored graphical lasso for data integration in gene network reconstruction
    Lingjaerde, Camilla
    Lien, Tonje G.
    Borgan, Ornulf
    Bergholtz, Helga
    Glad, Ingrid K.
    [J]. BMC BIOINFORMATICS, 2021, 22 (01)
  • [3] Tailored graphical lasso for data integration in gene network reconstruction
    Camilla Lingjærde
    Tonje G. Lien
    Ørnulf Borgan
    Helga Bergholtz
    Ingrid K. Glad
    [J]. BMC Bioinformatics, 22
  • [4] On the Complexity of the Weighted Fused Lasso
    Bento, Jose
    Furmaniak, Ralph
    Ray, Surjyendu
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2018, 25 (10) : 1595 - 1599
  • [5] Data Science in Stata 16: Frames, Lasso, and Python Integration
    Ho, Anson T. Y.
    Huynh, Kim P.
    Jacho-Chavez, David T.
    Rojas-Baez, Diego
    [J]. JOURNAL OF STATISTICAL SOFTWARE, 2021, 98 (SR1) : 1 - 9
  • [6] Weighted-LASSO for Structured Network Inference from Time Course Data
    Charbonnier, Camille
    Chiquet, Julien
    Ambroise, Christophe
    [J]. STATISTICAL APPLICATIONS IN GENETICS AND MOLECULAR BIOLOGY, 2010, 9 (01)
  • [7] Lag weighted lasso for time series model
    Heewon Park
    Fumitake Sakaori
    [J]. Computational Statistics, 2013, 28 : 493 - 504
  • [8] Lag weighted lasso for time series model
    Park, Heewon
    Sakaori, Fumitake
    [J]. COMPUTATIONAL STATISTICS, 2013, 28 (02) : 493 - 504
  • [9] Weighted Lasso subsampling for high dimensional regression
    Uraibi, Hassan S.
    [J]. ELECTRONIC JOURNAL OF APPLIED STATISTICAL ANALYSIS, 2019, 12 (01) : 69 - 84
  • [10] Weighted lasso in graphical gaussian modeling for large gene network estimation based on microarray data
    Shimamura, Teppei
    Imoto, Seiya
    Yamaguchi, Rui
    Miyano, Satoru
    [J]. GENOME INFORMATICS 2007, VOL 19, 2007, 19 : 142 - 153