Sparse Partially Linear Additive Models

Cited by: 38
Authors
Lou, Yin [1]
Bien, Jacob [2, 3]
Caruana, Rich [4]
Gehrke, Johannes [1]
Affiliations
[1] Cornell Univ, Dept Comp Sci, Ithaca, NY 14850 USA
[2] Cornell Univ, Dept Biol Stat & Computat Biol, Ithaca, NY 14850 USA
[3] Cornell Univ, Dept Stat Sci, Ithaca, NY 14850 USA
[4] Microsoft Corp, Microsoft Res, Redmond, WA 98052 USA
Keywords
Classification; Generalized partially linear additive models; Group lasso; Regression; Sparsity; SELECTION; REGRESSION; SHRINKAGE
DOI
10.1080/10618600.2015.1089775
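Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract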
The generalized partially linear additive model (GPLAM) is a flexible and interpretable approach to building predictive models. It combines features in an additive manner, allowing each to have either a linear or nonlinear effect on the response. However, the choice of which features to treat as linear or nonlinear is typically assumed known. Thus, to make a GPLAM a viable approach in situations in which little is known a priori about the features, one must overcome two primary model selection challenges: deciding which features to include in the model and determining which of these features to treat nonlinearly. We introduce the sparse partially linear additive model (SPLAM), which combines model fitting and both of these model selection challenges into a single convex optimization problem. SPLAM provides a bridge between the lasso and sparse additive models. Through a statistical oracle inequality and thorough simulation, we demonstrate that SPLAM can outperform other methods across a broad spectrum of statistical regimes, including the high-dimensional (p >> N) setting. We develop efficient algorithms that are applied to real datasets with half a million samples and over 45,000 features with excellent predictive performance. Supplementary materials for this article are available online.
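The abstract does not state the optimization problem itself. As a rough illustration of how a single convex penalty can make both selection decisions at once, the sketch below assumes a nested group-lasso-style penalty on each feature's coefficient vector, `t_all * ||beta||_2 + t_nonlinear * ||beta[1:]||_2`, where `beta[0]` is taken to be the linear coefficient and `beta[1:]` the nonlinear basis coefficients. This form, and the names `nested_prox` and `effect_type`, are assumptions made for illustration and are not taken from the record above.

```python
import numpy as np


def group_soft_threshold(v, t):
    """Proximal operator of t * ||v||_2: shrink v toward zero as a block."""
    norm = np.linalg.norm(v)
    if norm <= t:
        return np.zeros_like(v)
    return (1.0 - t / norm) * v


def nested_prox(beta, t_all, t_nonlinear):
    """Prox of the assumed penalty t_all*||beta||_2 + t_nonlinear*||beta[1:]||_2.

    Because the nonlinear group beta[1:] is nested inside the full group,
    the prox is the composition: inner group first, then the full group.
    """
    out = np.asarray(beta, dtype=float).copy()
    out[1:] = group_soft_threshold(out[1:], t_nonlinear)
    return group_soft_threshold(out, t_all)


def effect_type(beta):
    """Read off the selection decision from the sparsity pattern."""
    if not np.any(beta):
        return "excluded"
    if not np.any(beta[1:]):
        return "linear"
    return "nonlinear"


# Three hypothetical coefficient vectors for a single feature.
for raw in ([3.0, 0.3, 0.4], [0.3, 0.1, 0.1], [3.0, 4.0, 0.0]):
    shrunk = nested_prox(np.array(raw), t_all=1.0, t_nonlinear=1.0)
    print(raw, "->", effect_type(shrunk))
```

Under this assumed penalty, a feature whose nonlinear coefficients are all shrunk to zero reduces to a lasso-style linear term, and a feature whose whole vector is zeroed is dropped from the model, which is one way to picture the bridge between the lasso and sparse additive models that the abstract describes.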
Pages: 1026 - 1040
Number of pages: 15
Related papers
  • [1] Partially Linear Additive Models
    Toledo, Camila G.
    Lopes, Joysce S.
    Ferreira, Clecio S.
    [J]. SIGMAE, 2024, 13 (01) : 24 - 31
  • [2] Sparse model identification and learning for ultra-high-dimensional additive partially linear models
    Li, Xinyi
    Wang, Li
    Nettleton, Dan
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 2019, 173 : 204 - 228
  • [3] Partially Linear Additive Gaussian Graphical Models
    Geng, Sinong
    Yan, Minhao
    Kolar, Mladen
    Koyejo, Oluwasanmi
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [4] Efficient estimation of additive partially linear models
    Li, Q
    [J]. INTERNATIONAL ECONOMIC REVIEW, 2000, 41 (04) : 1073 - 1092
  • [5] Remodeling and Estimation for Sparse Partially Linear Regression Models
    Zeng, Yunhui
    Wang, Xiuli
    Lin, Lu
    [J]. ABSTRACT AND APPLIED ANALYSIS, 2013
  • [6] Fast Sparse Classification for Generalized Linear and Additive Models
    Liu, Jiachang
    Zhong, Chudi
    Seltzer, Margo
    Rudin, Cynthia
    [J]. INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151
  • [7] Estimation and inference for mixture of partially linear additive models
    Zhang, Yi
    Pan, Weiquan
    [J]. COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2022, 51 (08) : 2519 - 2533
  • [8] Quantile regression estimation of partially linear additive models
    Hoshino, Tadao
    [J]. JOURNAL OF NONPARAMETRIC STATISTICS, 2014, 26 (03) : 509 - 536
  • [9] Partially Linear Additive Models with Unknown Link Functions
    Zhang, Jun
    Lian, Heng
    [J]. SCANDINAVIAN JOURNAL OF STATISTICS, 2018, 45 (02) : 255 - 282
  • [10] Generalized Partially Linear Additive Models for Credit Scoring
    Shim, Ju-Hyun
    Lee, Young K.
    [J]. KOREAN JOURNAL OF APPLIED STATISTICS, 2011, 24 (04) : 587 - 595