Simultaneous variable selection and smoothing for high-dimensional function-on-scalar regression

Cited by: 12
Authors
Parodi, Alice [1 ]
Reimherr, Matthew [2 ]
Affiliations
[1] Politecn Milan, MOX Dept Math, Milan, Italy
[2] Penn State Univ, Dept Stat, University Pk, PA 16802 USA
Source
ELECTRONIC JOURNAL OF STATISTICS | 2018 / Vol. 12 / Issue 2
Keywords
Nonlinear regression; variable selection; functional data analysis; reproducing kernel Hilbert space; minimax convergence; VARYING-COEFFICIENT MODELS; ADAPTIVE LASSO; CHILDHOOD;
DOI
10.1214/18-EJS1509
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Subject classification codes
020208 ; 070103 ; 0714 ;
Abstract
We present a new methodology, called FLAME, which simultaneously selects important predictors and produces smooth estimates in a function-on-scalar linear model with a large number of scalar predictors. Our framework applies quite generally by viewing the functional outcomes as elements of an arbitrary real separable Hilbert space. To select important predictors while also producing smooth parameter estimates, we utilize operators to define subspaces that are imbued with certain desirable properties as determined by the practitioner and the setting, such as smoothness or periodicity. In special cases one can show that these subspaces correspond to Reproducing Kernel Hilbert Spaces; however, our methodology applies more broadly. We provide a very fast algorithm for computing the estimators, which is based on a functional coordinate descent, and an R package, flm, whose backend is written in C++. Asymptotic properties of the estimators are developed and simulations are provided to illustrate the advantages of FLAME over existing methods, both in terms of statistical performance and computational efficiency. We conclude with an application to childhood asthma, where we find a potentially important genetic mutation that was not selected by previous functional data based methods.
Pages: 4602 - 4639
Page count: 38