Large Scale Empirical Risk Minimization via Truncated Adaptive Newton Method

Cited by: 0

Authors:

Eisen, Mark [1]

Mokhtari, Aryan [2]

Ribeiro, Alejandro [1]

Affiliations:

[1] Univ Penn, Philadelphia, PA 19104 USA

[2] MIT, Cambridge, MA 02139 USA

Source:

INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 84 | 2018 / Vol. 84

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Most second-order methods are inapplicable to large-scale empirical risk minimization (ERM) problems because both the number of samples N and the number of parameters p are large. Large N makes it costly to evaluate Hessians, and large p makes it costly to invert them. This paper proposes a novel adaptive sample size second-order method, which reduces the cost of computing the Hessian by solving a sequence of ERM problems, each corresponding to a subset of the samples, and lowers the cost of computing the Hessian inverse by using a truncated eigenvalue decomposition. Although the sample size grows at a geometric rate, it is shown that a single iteration in each growth stage suffices to track the optimal classifier to within its statistical accuracy. This results in convergence to the optimal classifier associated with the whole set in a number of iterations that scales with log(N). The use of a truncated eigenvalue decomposition results in a per-iteration cost of order p^2. Theoretical performance gains manifest in practical implementations.
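
The two ideas in the abstract (a geometrically growing sample with one Newton-type step per stage, and a Hessian inverse built from only the leading eigenpairs) can be illustrated with a minimal sketch. It is not the authors' implementation. The following are all assumptions made for illustration: the L2-regularized logistic loss, the fixed regularization weight `reg`, the rule that replaces every discarded eigenvalue by the smallest retained one, the synthetic data, and every function name. For brevity the sketch calls a full `numpy.linalg.eigh`, which costs order p^3; reaching the order p^2 cost stated in the abstract would need a partial eigensolver that computes only the top k eigenpairs.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_grad(w, X, y, reg):
    """L2-regularized logistic loss and its gradient on the samples (X, y)."""
    z = X @ w
    # log(1 + exp(z)) - y*z, evaluated stably with logaddexp
    loss = np.mean(np.logaddexp(0.0, z) - y * z) + 0.5 * reg * (w @ w)
    grad = X.T @ (sigmoid(z) - y) / len(y) + reg * w
    return loss, grad

def truncated_newton_step(w, X, y, reg, k):
    """One Newton-type step using only the k leading Hessian eigenpairs."""
    n, p = X.shape
    _, grad = loss_and_grad(w, X, y, reg)
    s = sigmoid(X @ w)
    H = (X.T * (s * (1.0 - s))) @ X / n + reg * np.eye(p)
    vals, vecs = np.linalg.eigh(H)           # eigenvalues in ascending order
    top_vals, top_vecs = vals[-k:], vecs[:, -k:]
    # Assumption: discarded eigenvalues are replaced by the smallest
    # retained one, which gives a conservative step in those directions.
    tail = top_vals[0]
    coef = top_vecs.T @ grad
    step = top_vecs @ (coef / top_vals) + (grad - top_vecs @ coef) / tail
    return w - step

def adaptive_truncated_newton(X, y, reg=1e-2, k=5, n0=100, growth=2):
    """Grow the sample geometrically and take one step per growth stage."""
    N, p = X.shape
    w = np.zeros(p)
    n = min(n0, N)
    stages = 0
    while True:
        w = truncated_newton_step(w, X[:n], y[:n], reg, k)
        stages += 1
        if n == N:
            break
        n = min(growth * n, N)
    return w, stages

def make_data(N=2000, p=20, seed=0):
    """Synthetic logistic-regression data (hypothetical, for the demo only)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(N, p))
    w_true = 2.0 * rng.normal(size=p) / np.sqrt(p)
    y = (rng.random(N) < sigmoid(X @ w_true)).astype(float)
    return X, y
```

Calling `adaptive_truncated_newton(*make_data())` passes through sample sizes 100, 200, 400, 800, 1600 and 2000, so the number of stages grows roughly with log(N) when the initial sample size is held fixed.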


Pages: 9

50 items in total

[31] NEWTON-TYPE MINIMIZATION VIA THE LANCZOS METHOD.
Nash, Stephen G.
1600, (21):
[32] A METHOD OF MINIMIZATION OF THE EMPIRICAL RISK IN IDENTIFICATION PROBLEMS
TSYBAKOV, AB
AUTOMATION AND REMOTE CONTROL, 1981, 42 (09) : 1196 - 1203
[33] Particle filtering methods for stochastic optimization with application to large-scale empirical risk minimization
Liu, Bin
KNOWLEDGE-BASED SYSTEMS, 2020, 193
[34] Nonconvex Truncated Nuclear Norm Minimization Based on Adaptive Bisection Method
Su, Xinhua
Wang, Yilun
Kang, Xuejing
Tao, Ran
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2019, 29 (11) : 3159 - 3172
[35] A nonmonotone truncated Newton-Krylov method exploiting negative curvature directions, for large scale unconstrained optimization
Fasano, Giovanni
Lucidi, Stefano
OPTIMIZATION LETTERS, 2009, 3 (04) : 521 - 535
[36] A Truncated Nonmonotone Gauss-Newton Method for Large-Scale Nonlinear Least-Squares Problems
G. Fasano
F. Lampariello
M. Sciandrone
Computational Optimization and Applications, 2006, 34 : 343 - 358
[37] A truncated nonmonotone Gauss-Newton method for large-scale nonlinear least-squares problems
Fasano, G
Lampariello, F
Sciandrone, M
COMPUTATIONAL OPTIMIZATION AND APPLICATIONS, 2006, 34 (03) : 343 - 358
[38] Curvilinear stabilization techniques for truncated Newton methods in large scale unconstrained optimization
Lucidi, S
Rochetich, F
Roma, M
SIAM JOURNAL ON OPTIMIZATION, 1998, 8 (04) : 916 - 939
[39] Numerical experiences with new truncated Newton methods in large scale unconstrained optimization
Lucidi, S
Roma, M
COMPUTATIONAL OPTIMIZATION AND APPLICATIONS, 1997, 7 (01) : 71 - 87
[40] RFN: A Random-Feature Based Newton Method for Empirical Risk Minimization in Reproducing Kernel Hilbert Spaces
Chang, Ting-Jui
Shahrampour, Shahin
IEEE Transactions on Signal Processing, 2022, 70 : 5308 - 5319
