Sparse regression for extreme values

Cited by: 3
Authors
Chang, Andersen [1]
Wang, Minjie [1]
Allen, Genevera I. [1, 2, 3, 4, 5]
Affiliations
[1] Rice Univ, Dept Stat, Houston, TX 77251 USA
[2] Rice Univ, Dept Elect & Comp Engn, Houston, TX 77251 USA
[3] Rice Univ, Dept Comp Sci, Houston, TX 77251 USA
[4] Baylor Coll Med, Dept Pediat Neurol, Houston, TX 77030 USA
[5] Texas Childrens Hosp, Jan & Dan Duncan Neurol Res Inst, Houston, TX 77030 USA
Source
ELECTRONIC JOURNAL OF STATISTICS | 2021 / Vol. 15 / Issue 02
Keywords
Linear regression; sparse modeling; extreme values; Subbotin distribution; generalized normal distribution; VARIABLE SELECTION; ROBUST REGRESSION; CONSISTENCY; INFERENCE; MODEL;
DOI
10.1214/21-EJS1937
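Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code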
020208 ; 070103 ; 0714 ;
Abstract
We study the problem of selecting features associated with extreme values in high dimensional linear regression. Normally, in linear modeling problems, the presence of abnormal extreme values or outliers is considered an anomaly which should either be removed from the data or remedied using robust regression methods. In many situations, however, the extreme values in regression modeling are not outliers but rather the signals of interest; consider traces from spiking neurons, volatility in finance, or extreme events in climate science, for example. In this paper, we propose a new method for sparse high-dimensional linear regression for extreme values which is motivated by the Subbotin, or generalized normal distribution, which we call the extreme value linear regression model. For our method, we utilize an ℓ_p norm loss where p is an even integer greater than two; we demonstrate that this loss increases the weight on extreme values. We prove consistency and variable selection consistency for the extreme value linear regression with a Lasso penalty, which we term the Extreme Lasso, and we also analyze the theoretical impact of extreme value observations on the model parameter estimates using the concept of influence functions. Through simulation studies and a real-world data example, we show that the Extreme Lasso outperforms other methods currently used in the literature for selecting features of interest associated with extreme values in high-dimensional regression.
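The abstract's claim that a higher-order loss puts more weight on extreme values can be illustrated with a small sketch. This is not the authors' code: the residual values and the helper name lp_loss_shares are hypothetical, and the sketch only compares how much of the total sum of |r_i|^p comes from one large residual when p = 2 versus p = 4.

```python
import numpy as np


def lp_loss_shares(residuals, p):
    """Fraction of the total sum(|r_i|^p) contributed by each residual.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    powered = np.abs(np.asarray(residuals, dtype=float)) ** p
    return powered / powered.sum()


# Made-up residuals: four small ones and one extreme value (last entry).
residuals = np.array([0.5, -0.5, 0.5, -0.5, 3.0])

share_p2 = lp_loss_shares(residuals, p=2)  # ordinary squared-error loss
share_p4 = lp_loss_shares(residuals, p=4)  # even p greater than two

print("share of extreme residual, p=2:", share_p2[-1])
print("share of extreme residual, p=4:", share_p4[-1])
```

In this toy example the extreme residual accounts for a larger fraction of the loss at p = 4 than at p = 2, so a fit that minimizes the higher-order loss is pulled more strongly toward explaining that observation.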
Pages: 5995 - 6035
Number of pages: 41
Related papers
50 in total
  • [21] Adaptive sparse regression
    Figueiredo, MAT
    NONLINEAR ESTIMATION AND CLASSIFICATION, 2003, 171 : 237 - 247
  • [22] Sparse Regression Codes
    Venkataramanan, Ramji
    Tatikonda, Sekhar
    Barron, Andrew
    FOUNDATIONS AND TRENDS IN COMMUNICATIONS AND INFORMATION THEORY, 2019, 15 (1-2): : 1 - 195
  • [23] Sparse Regression by Projection and Sparse Discriminant Analysis
    Qi, Xin
    Luo, Ruiyan
    Carroll, Raymond J.
    Zhao, Hongyu
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2015, 24 (02) : 416 - 438
  • [24] Sparse PCA from Sparse Linear Regression
    Bresler, Guy
    Park, Sung Min
    Persu, Madalina
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [25] Sparse Least Squares Support Vector Regression via Multiresponse Sparse Regression
    Vieira, David Clifte da S.
    Rocha Neto, Ajalmar R.
    Rodrigues, Antonio Wendell de O.
    2016 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2016, : 3218 - 3225
  • [26] Are Latent Factor Regression and Sparse Regression Adequate?
    Fan, Jianqing
    Lou, Zhipeng
    Yu, Mengxin
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2024, 119 (546) : 1076 - 1088
  • [27] P-min-Stable Regression Models for Time Series With Extreme Values of Limited Range
    Nascimento, Leonardo Brandao Freitas
    Lima, Max Sousa
    Duczmal, Luiz H.
    ENVIRONMETRICS, 2025, 36 (02)
  • [28] Detection of rare variant effects in association studies: extreme values, iterative regression, and a hybrid approach
    Zhaogong Zhang
    Qiuying Sha
    Xinli Wang
    Shuanglin Zhang
    BMC Proceedings, 5 (Suppl 9)
  • [29] Extreme logistic regression
    Che Ngufor
    Janusz Wojtusiak
    Advances in Data Analysis and Classification, 2016, 10 : 27 - 52
  • [30] Extreme logistic regression
    Ngufor, Che
    Wojtusiak, Janusz
    ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2016, 10 (01) : 27 - 52