Sparsity in penalized empirical risk minimization

Cited by: 56
Authors
Koltchinskii, Vladimir [1 ]
Affiliation
[1] Georgia Inst Technol, Sch Math, Atlanta, GA 30332 USA
Keywords
Empirical risk; Penalized empirical risk; ℓp-penalty; Sparsity; Oracle inequalities; Model selection; Complexities; Recovery
DOI
10.1214/07-AIHP146
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline Codes
020208; 070103; 0714
Abstract
Let (X, Y) be a random couple in S × T with unknown distribution P, and let (X_1, Y_1), ..., (X_n, Y_n) be i.i.d. copies of (X, Y), with P_n their empirical distribution. Let h_1, ..., h_N : S → [-1, 1] be a dictionary consisting of N functions. For λ ∈ R^N, denote f_λ := Σ_{j=1}^N λ_j h_j. Let ℓ : T × R → R be a given loss function, convex with respect to the second variable, and denote (ℓ • f)(x, y) := ℓ(y; f(x)). We study the following penalized empirical risk minimization problem

λ̂^ε := argmin_{λ ∈ R^N} [ P_n(ℓ • f_λ) + ε ‖λ‖_{ℓ_p}^p ],

which is an empirical version of the problem

λ^ε := argmin_{λ ∈ R^N} [ P(ℓ • f_λ) + ε ‖λ‖_{ℓ_p}^p ]

(here ε ≥ 0 is a regularization parameter; λ^0 corresponds to ε = 0). A number of regression and classification problems fit this general framework. We are interested in the case when p ≥ 1 but close enough to 1 (so that p − 1 is of the order 1/log N, or smaller). We show that the "sparsity" of λ^ε implies the "sparsity" of λ̂^ε, and we study the impact of "sparsity" on bounding the excess risk P(ℓ • f_{λ̂^ε}) − P(ℓ • f_{λ^0}) of solutions of empirical risk minimization problems.
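The penalized ERM problem in the abstract can be sketched numerically. The following is a minimal illustration, not the paper's method: it uses squared loss, a synthetic random dictionary, and a generic smooth optimizer; all data and parameter values (n, N, ε, the sparse target) are invented for the example. The exponent p is chosen slightly above 1, of order 1 + 1/log N, matching the regime the paper studies.

```python
import numpy as np
from scipy.optimize import minimize

# Sketch of the penalized ERM problem
#   min over lambda in R^N of  P_n(ell . f_lambda) + eps * ||lambda||_p^p
# with squared loss ell(y, u) = (y - u)^2. Synthetic data for illustration.

rng = np.random.default_rng(0)
n, N = 200, 20
H = rng.uniform(-1.0, 1.0, size=(n, N))        # H[i, j] = h_j(X_i), values in [-1, 1]
true_lam = np.zeros(N)
true_lam[:3] = [1.0, -0.5, 0.75]               # sparse "oracle" vector
Y = H @ true_lam + 0.1 * rng.standard_normal(n)

def objective(lam, H, Y, eps, p):
    """Empirical risk P_n(ell . f_lambda) plus eps * ||lambda||_p^p."""
    residual = Y - H @ lam
    return np.mean(residual**2) + eps * np.sum(np.abs(lam) ** p)

# p close to 1: p - 1 of the order 1/log N, as in the regime of interest
p = 1.0 + 1.0 / np.log(N)
eps = 0.05

# For p > 1 the penalty is differentiable, so a quasi-Newton method applies
res = minimize(objective, x0=np.zeros(N), args=(H, Y, eps, p), method="L-BFGS-B")
lam_hat = res.x
```

Because p > 1 the objective is smooth and convex, so a standard solver suffices; at p = 1 exactly (the lasso case), a nonsmooth method such as coordinate descent or proximal gradient would be needed instead.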
Pages
7 - 57
Number of Pages
51