Correcting the bias in least squares regression with volume-rescaled sampling
Authors:
Derezinski, Michal [1]
Warmuth, Manfred K. [2]
Hsu, Daniel [3]
Affiliations:
[1] Univ Calif Berkeley, Dept Stat, Berkeley, CA 94720 USA
[2] Univ Calif Santa Cruz, Dept Comp Sci, Santa Cruz, CA 95064 USA
[3] Columbia Univ, Dept Comp Sci, New York, NY 10027 USA
Source:
22nd International Conference on Artificial Intelligence and Statistics, Vol. 89, 2019
DOI: not available
Chinese Library Classification number:
TP18 [Artificial intelligence theory]
Discipline classification codes:
081104; 0812; 0835; 1405
Abstract:
Consider linear regression where the examples are generated by an unknown distribution on R^d x R. Without any assumptions on the noise, the linear least squares solution for any i.i.d. sample will typically be biased w.r.t. the least squares optimum over the entire distribution. However, we show that if an i.i.d. sample of any size k is augmented by a certain small additional sample, then the solution of the combined sample becomes unbiased. We show this when the additional sample consists of d points drawn jointly according to the input distribution that is rescaled by the squared volume spanned by the points. Furthermore, we propose algorithms to sample from this volume-rescaled distribution when the data distribution is only known through an i.i.d. sample.
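The unbiasedness claim in the abstract can be checked numerically on a toy case. The sketch below is not the authors' algorithm: it assumes the data distribution is uniform over the n rows of a small made-up dataset (X, y), so that the expectation can be computed exactly by enumerating every sample. The function names `least_squares` and `expected_augmented_solution` are hypothetical, chosen only for this illustration. Each tuple of d points is weighted by the squared determinant of its input matrix, which is the "squared volume" rescaling the abstract describes.

```python
import itertools
import numpy as np


def least_squares(X, y):
    """Least squares solution (minimum-norm if X is rank deficient)."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w


def expected_augmented_solution(X, y, k):
    """Exact expected least squares solution over k i.i.d. points plus
    d volume-rescaled points.

    Assumption for this sketch: the data distribution is uniform over the
    rows of (X, y), so all samples can be enumerated.
    """
    n, d = X.shape
    total = np.zeros(d)
    total_weight = 0.0
    # Ordered d-tuples (with replacement), weighted by squared volume.
    for vol_idx in itertools.product(range(n), repeat=d):
        weight = np.linalg.det(X[list(vol_idx)]) ** 2
        # Every i.i.d. k-tuple is equally likely, so it adds no extra weight.
        for iid_idx in itertools.product(range(n), repeat=k):
            idx = list(vol_idx + iid_idx)
            total += weight * least_squares(X[idx], y[idx])
            total_weight += weight
    return total / total_weight


rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))   # toy inputs: n = 5 points in R^2
y = rng.normal(size=5)        # arbitrary responses, no noise model assumed

w_star = least_squares(X, y)                      # optimum over the whole distribution
w_avg = expected_augmented_solution(X, y, k=2)    # average over augmented samples
print("population optimum:   ", w_star)
print("expected augmented fit:", w_avg)
```

The two printed vectors should agree up to floating-point error. Enumeration costs n^(d+k) least squares solves, so this check is only practical for very small n, d, and k.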