Robust Methods for High-Dimensional Linear Learning

Cited by: 0

Authors:

Merad, Ibrahim [1]

Gaiffas, Stephane [1,2]

Affiliations:

[1] Univ Paris Diderot, LPSM, UMR 8001, Paris, France

[2] Ecole Normale Super, DMA, Paris, France

Source:

JOURNAL OF MACHINE LEARNING RESEARCH | 2023 / Vol. 24

Keywords:

robust methods; heavy-tailed data; outliers; sparse recovery; mirror descent; generalization error; VARIABLE SELECTION; REGRESSION SHRINKAGE; ORACLE INEQUALITIES; LASSO; ESTIMATORS; SPARSITY; SLOPE; REGULARIZATION; RECOVERY; BOUNDS;

DOI:

Not available

Chinese Library Classification number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

We propose statistically robust and computationally efficient linear learning methods in the high-dimensional batch setting, where the number of features d may exceed the sample size n. We employ, in a generic learning setting, two algorithms depending on whether the considered loss function is gradient-Lipschitz or not. Then, we instantiate our framework on several applications including vanilla sparse, group-sparse and low-rank matrix recovery. This leads, for each application, to efficient and robust learning algorithms that reach near-optimal estimation rates under heavy-tailed distributions and the presence of outliers. For vanilla s-sparsity, we are able to reach the s log(d)/n rate under heavy tails and eta-corruption, at a computational cost comparable to that of non-robust analogs. We provide an efficient implementation of our algorithms in an open-source Python library called linlearn, by means of which we carry out numerical experiments which confirm our theoretical findings together with a comparison to other recent approaches proposed in the literature.


Pages: 44

50 items in total

[21] High-dimensional linear discriminant analysis using nonparametric methods
Park, Hoyoung
Baek, Seungchul
Park, Junyong
JOURNAL OF MULTIVARIATE ANALYSIS, 2022, 188
[22] Robust Hessian Locally Linear Embedding Techniques for High-Dimensional Data
Xing, Xianglei
Du, Sidan
Wang, Kejun
ALGORITHMS, 2016, 9 (02)
[23] Robust Inference for High-Dimensional Linear Models via Residual Randomization
Wang, Y. Samuel
Lee, Si Kai
Toulis, Panos
Kolar, Mladen
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139 : 7818 - 7828
[24] Robust and consistent variable selection in high-dimensional generalized linear models
Avella-Medina, Marco
Ronchetti, Elvezio
BIOMETRIKA, 2018, 105 (01) : 31 - 44
[25] Transfer Learning Under High-Dimensional Generalized Linear Models
Tian, Ye
Feng, Yang
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2023, 118 (544) : 2684 - 2697
[26] Sparsity Oriented Importance Learning for High-Dimensional Linear Regression
Ye, Chenglong
Yang, Yi
Yang, Yuhong
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2018, 113 (524) : 1797 - 1812
[27] Distributed Continual Learning With CoCoA in High-Dimensional Linear Regression
Hellkvist, Martin
Ozcelikkale, Ayca
Ahlen, Anders
IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2024, 72 : 1015 - 1031
[28] Scalable Algorithms for Learning High-Dimensional Linear Mixed Models
Tan, Zilong
Roche, Kimberly
Zhou, Xiang
Mukherjee, Sayan
UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2018 : 259 - 268
[29] ROBUST NEAREST-NEIGHBOR METHODS FOR CLASSIFYING HIGH-DIMENSIONAL DATA
Chan, Yao-Ban
Hall, Peter
ANNALS OF STATISTICS, 2009, 37 (6A) : 3186 - 3203
[30] A Comparison of Machine Learning Methods in a High-Dimensional Classification Problem
Zekic-Susac, Marijana
Pfeifer, Sanja
Sarlija, Natasa
BUSINESS SYSTEMS RESEARCH JOURNAL, 2014, 5 (03) : 82 - 96
