Large Scale Empirical Risk Minimization via Truncated Adaptive Newton Method

Cited by: 0
Authors
Eisen, Mark [1]
Mokhtari, Aryan [2]
Ribeiro, Alejandro [1]
Affiliations
[1] Univ Penn, Philadelphia, PA 19104 USA
[2] MIT, Cambridge, MA 02139 USA
Source
INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 84 | 2018 / Vol. 84
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Most second-order methods are inapplicable to large-scale empirical risk minimization (ERM) problems because both the number of samples N and the number of parameters p are large. Large N makes it costly to evaluate Hessians, and large p makes it costly to invert them. This paper proposes a novel adaptive sample size second-order method, which reduces the cost of computing the Hessian by solving a sequence of ERM problems corresponding to a subset of samples and lowers the cost of computing the Hessian inverse using a truncated eigenvalue decomposition. Although the sample size is grown at a geometric rate, it is shown that it is sufficient to run a single iteration in each growth stage to track the optimal classifier to within its statistical accuracy. This results in convergence to the optimal classifier associated with the whole set in a number of iterations that scales with log(N). The use of a truncated eigenvalue decomposition results in the cost of each iteration being of order p^2. Theoretical performance gains manifest in practical implementations.
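The two ideas in the abstract, growing the sample geometrically with one Newton-type step per stage and inverting only a truncated eigendecomposition of the Hessian, can be illustrated with a minimal Python sketch. This is not the authors' algorithm. It assumes an L2-regularized logistic loss, a regularization weight of c/sqrt(n), a few warm-up steps on the initial subset, and replacement of the discarded eigenvalues by the (k+1)-th largest one. All function names and parameters (`n0`, `growth`, `k`, `c`) are hypothetical. For brevity it calls a full `numpy.linalg.eigh`, which costs on the order of p^3; a partial eigensolver would be needed to approach the per-iteration cost stated in the abstract.

```python
import numpy as np

def logistic_grad_hess(w, X, y, reg):
    """Gradient and Hessian of the mean logistic loss plus (reg/2)*||w||^2."""
    s = 1.0 / (1.0 + np.exp(-(X @ w)))        # predicted probabilities
    grad = X.T @ (s - y) / len(y) + reg * w
    d = s * (1.0 - s)                          # per-sample curvature weights
    hess = (X.T * d) @ X / len(y) + reg * np.eye(X.shape[1])
    return grad, hess

def truncated_newton_step(grad, hess, k):
    """Approximate H^{-1} grad using only the k largest eigenpairs of H.

    Assumption of this sketch: every discarded eigenvalue is replaced by
    the (k+1)-th largest one, so the step is never longer than Newton's.
    """
    vals, vecs = np.linalg.eigh(hess)          # eigenvalues in ascending order
    p = len(vals)
    top_vals, top_vecs = vals[p - k:], vecs[:, p - k:]
    tail = vals[p - k - 1] if k < p else top_vals[0]
    coef = top_vecs.T @ grad
    # H_approx^{-1} = (1/tail) I + V (diag(1/top_vals) - (1/tail) I) V^T
    return grad / tail + top_vecs @ ((1.0 / top_vals - 1.0 / tail) * coef)

def adaptive_truncated_newton(X, y, n0=64, growth=2.0, k=5, c=1.0, warmup=5):
    """Grow the sample geometrically, taking one truncated step per stage."""
    N, p = X.shape
    w = np.zeros(p)
    n = min(n0, N)
    for _ in range(warmup):                    # assumed warm start on the first subset
        g, H = logistic_grad_hess(w, X[:n], y[:n], c / np.sqrt(n))
        w = w - truncated_newton_step(g, H, k)
    stages = 0
    while n < N:
        n = min(int(growth * n), N)            # geometric growth of the sample
        g, H = logistic_grad_hess(w, X[:n], y[:n], c / np.sqrt(n))
        w = w - truncated_newton_step(g, H, k) # a single step per growth stage
        stages += 1
    return w, stages
```

With `growth=2.0`, the number of stages is roughly log2(N / n0), which mirrors the log(N) scaling described in the abstract. Setting `k` equal to p makes `truncated_newton_step` coincide with an exact Newton step.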
Pages: 9