Sparse regression for extreme values

Cited by: 3
Authors
Chang, Andersen [1]
Wang, Minjie [1]
Allen, Genevera I. [1, 2, 3, 4, 5]
Affiliations
[1] Rice Univ, Dept Stat, Houston, TX 77251 USA
[2] Rice Univ, Dept Elect & Comp Engn, Houston, TX 77251 USA
[3] Rice Univ, Dept Comp Sci, Houston, TX 77251 USA
[4] Baylor Coll Med, Dept Pediat Neurol, Houston, TX 77030 USA
[5] Texas Childrens Hosp, Jan & Dan Duncan Neurol Res Inst, Houston, TX 77030 USA
Source
ELECTRONIC JOURNAL OF STATISTICS | 2021, Vol. 15, No. 2
Keywords
Linear regression; sparse modeling; extreme values; Subbotin distribution; generalized normal distribution; variable selection; robust regression; consistency; inference; model
DOI
10.1214/21-EJS1937
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
We study the problem of selecting features associated with extreme values in high-dimensional linear regression. Normally, in linear modeling problems, the presence of abnormal extreme values or outliers is considered an anomaly which should either be removed from the data or remedied using robust regression methods. In many situations, however, the extreme values in regression modeling are not outliers but rather the signals of interest; consider traces from spiking neurons, volatility in finance, or extreme events in climate science, for example. In this paper, we propose a new method for sparse high-dimensional linear regression for extreme values which is motivated by the Subbotin, or generalized normal distribution, which we call the extreme value linear regression model. For our method, we utilize an ℓ_p norm loss where p is an even integer greater than two; we demonstrate that this loss increases the weight on extreme values. We prove consistency and variable selection consistency for the extreme value linear regression with a Lasso penalty, which we term the Extreme Lasso, and we also analyze the theoretical impact of extreme value observations on the model parameter estimates using the concept of influence functions. Through simulation studies and a real-world data example, we show that the Extreme Lasso outperforms other methods currently used in the literature for selecting features of interest associated with extreme values in high-dimensional regression.
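
The abstract describes the estimator only in words: an ℓ_p loss with even p greater than two, combined with a Lasso penalty. The following is a minimal sketch of what such an objective could look like in code, not the authors' implementation. It assumes the loss is scaled as (1/(p·n))·Σ(y_i − x_iᵀβ)^p and is minimized by proximal gradient descent with backtracking; the scaling, the solver, and the names `lp_loss`, `lp_grad`, `soft_threshold`, and `extreme_lasso` are all illustrative assumptions that do not come from this record.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def lp_loss(X, y, beta, p):
    """Assumed loss scaling: sum of residuals^p divided by (p * n), p even."""
    r = y - X @ beta
    return np.sum(r ** p) / (p * len(y))


def lp_grad(X, y, beta, p):
    """Gradient of lp_loss; each residual is weighted by r^(p-1), so large
    residuals dominate more strongly than under squared error (p = 2)."""
    r = y - X @ beta
    return -X.T @ (r ** (p - 1)) / len(y)


def extreme_lasso(X, y, lam, p=4, n_iter=500, step=1.0, shrink=0.5, tol=1e-10):
    """Proximal gradient descent on lp_loss + lam * ||beta||_1.

    Backtracking is used because the gradient of the l_p loss is not
    globally Lipschitz for p > 2, so a fixed step size is not safe.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        f = lp_loss(X, y, beta, p)
        g = lp_grad(X, y, beta, p)
        t = step
        while True:
            cand = soft_threshold(beta - t * g, t * lam)
            d = cand - beta
            # Standard sufficient-decrease condition for proximal gradient.
            if lp_loss(X, y, cand, p) <= f + g @ d + (d @ d) / (2.0 * t):
                break
            t *= shrink
        beta = cand
        if np.max(np.abs(d)) < tol:
            break
    return beta


# Illustrative usage on synthetic data (hypothetical values, fixed seed).
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((100, 20))
y_demo = 2.0 * X_demo[:, 0] + rng.standard_normal(100)
beta_demo = extreme_lasso(X_demo, y_demo, lam=0.5, p=4)
print("nonzero coefficients:", np.flatnonzero(beta_demo))
```

Under this assumed scaling, any `lam` at or above the largest absolute entry of the gradient at β = 0 returns the all-zero vector, which gives a natural upper end for a tuning grid.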
Pages: 5995-6035
Number of pages: 41