USING TAYLOR-APPROXIMATED GRADIENTS TO IMPROVE THE FRANK-WOLFE METHOD FOR EMPIRICAL RISK MINIMIZATION

Cited: 0
Authors
Xiong, Zikai [1 ]
Freund, Robert M. [2 ]
Affiliations
[1] MIT, Operations Research Center, Cambridge, MA 02139 USA
[2] MIT, Sloan School of Management, Cambridge, MA 02139 USA
Keywords
Frank-Wolfe; linear minimization oracle; empirical risk minimization; linear prediction; computational complexity; convex optimization
DOI
10.1137/22M1519286
Chinese Library Classification
O29 [Applied Mathematics]
Subject Classification Code
070104
Abstract
The Frank-Wolfe method has become increasingly useful in statistical and machine learning applications due to the structure-inducing properties of its iterates, and especially in settings where linear minimization over the feasible set is more computationally efficient than projection. In the setting of empirical risk minimization, one of the fundamental optimization problems in statistics and machine learning, the computational cost of Frank-Wolfe methods typically grows linearly in the number of data observations n. This is in stark contrast to the case for typical stochastic projection methods. In order to reduce this dependence on n, we look to the second-order smoothness of typical smooth loss functions (least squares loss and logistic loss, for example) and propose amending the Frank-Wolfe method with Taylor series-approximated gradients, including variants for both deterministic and stochastic settings. Compared with current state-of-the-art methods in the regime where the optimality tolerance ε is sufficiently small, our methods reduce the dependence on large n while simultaneously obtaining the optimal convergence rates of Frank-Wolfe methods in both convex and nonconvex settings. We also propose a novel adaptive step-size approach for which we have computational guarantees. Finally, we present computational experiments which show that our methods exhibit significant speedups over existing methods on real-world datasets for both convex and nonconvex binary classification problems.
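To make the core idea concrete, the following is a minimal NumPy sketch, not the authors' algorithm: a Frank-Wolfe loop over an l1-ball in which the exact gradient is computed only at periodic anchor points, and intermediate gradients are formed from the first-order Taylor expansion grad f(x) ≈ grad f(x0) + H(x0)(x - x0). Least squares is used because its Hessian is constant, so the expansion is exact here; for other second-order-smooth losses (for example, logistic loss) it is an approximation, which is the regime the paper's deterministic and stochastic variants handle with guarantees. All names (A, b, radius, refresh_every) and the classic 2/(t+2) step size, rather than the paper's adaptive step-size rule, are illustrative assumptions.

    import numpy as np

    def lmo_l1_ball(g, radius):
        # Linear minimization oracle over the l1-ball of the given radius:
        # argmin_{||v||_1 <= radius} <g, v> is attained at a signed coordinate vertex.
        j = np.argmax(np.abs(g))
        v = np.zeros_like(g)
        v[j] = -radius * np.sign(g[j])
        return v

    def taylor_fw_least_squares(A, b, radius, iters=200, refresh_every=20):
        # Frank-Wolfe for min_x (1/2n) ||Ax - b||^2 over {x : ||x||_1 <= radius},
        # using a Taylor-approximated gradient between anchor refreshes.
        n, d = A.shape
        x = np.zeros(d)
        H0 = (A.T @ A) / n                # Hessian; constant for least squares
        for t in range(iters):
            if t % refresh_every == 0:    # exact gradient at the anchor: one O(nd) pass over the data
                x0 = x.copy()
                g0 = A.T @ (A @ x0 - b) / n
            g = g0 + H0 @ (x - x0)        # Taylor-approximated gradient: no pass over the n observations
            v = lmo_l1_ball(g, radius)    # call the linear minimization oracle
            gamma = 2.0 / (t + 2)         # classic Frank-Wolfe step size (the paper proposes an adaptive rule)
            x = x + gamma * (v - x)
        return x

    # Illustrative usage on synthetic data:
    rng = np.random.default_rng(0)
    A = rng.standard_normal((1000, 50))
    b = A @ rng.standard_normal(50) + 0.1 * rng.standard_normal(1000)
    x_hat = taylor_fw_least_squares(A, b, radius=5.0)

Because the Hessian is constant here, the sketch mainly illustrates the mechanics; the paper's contribution lies in controlling the approximation error of such expansions for general smooth losses so that full passes over the n observations are needed only rarely.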
Pages: 2503-2534
Page count: 32
Related Papers
3 records
  • [1] A Newton Frank-Wolfe method for constrained self-concordant minimization
    Liu, Deyi
    Cevher, Volkan
    Tran-Dinh, Quoc
    JOURNAL OF GLOBAL OPTIMIZATION, 2022, 83 (2): 273-299
  • [2] Speeding up the Frank-Wolfe method using the Orthogonal Jacobi polynomials
    Francis, Robin
    Chepuri, Sundeep Prabhakar
    2022 56TH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS, AND COMPUTERS, 2022: 1081-1085
  • [3] Semi-supervised empirical risk minimization: Using unlabeled data to improve prediction
    Yuval, Oren
    Rosset, Saharon
    ELECTRONIC JOURNAL OF STATISTICS, 2022, 16 (1): 1434-1460