A stochastic extra-step quasi-Newton method for nonsmooth nonconvex optimization

Authors
Minghan Yang
Andre Milzarek
Zaiwen Wen
Tong Zhang
Affiliations
[1] Peking University, Beijing International Center for Mathematical Research (BICMR)
[2] The Chinese University of Hong Kong - Shenzhen, School of Data Science (SDS)
[3] Shenzhen Research Institute of Big Data, Center for Data Science
[4] SRIBD
[5] Shenzhen Institute of Artificial Intelligence and Robotics for Society
[6] AIRS
[7] Peking University
[8] Hong Kong University of Science and Technology
Source
Mathematical Programming | 2022 / Volume 194
Keywords
Nonsmooth stochastic optimization; Stochastic approximation; Global convergence; Stochastic higher order method; Stochastic quasi-Newton scheme; 90C06; 90C15; 90C26; 90C53;
DOI
Not available
Abstract
In this paper, a novel stochastic extra-step quasi-Newton method is developed to solve a class of nonsmooth nonconvex composite optimization problems. We assume that the gradient of the smooth part of the objective function can only be approximated by stochastic oracles. The proposed method combines general stochastic higher-order steps, derived from an underlying proximal-type fixed-point equation, with additional stochastic proximal gradient steps to guarantee convergence. Based on suitable bounds on the step sizes, we establish global convergence to stationary points in expectation, and we discuss an extension of the approach using variance reduction techniques. Motivated by large-scale and big data applications, we investigate a stochastic coordinate-type quasi-Newton scheme that makes it possible to generate cheap and tractable stochastic higher-order directions. Finally, numerical results on large-scale logistic regression and deep learning problems show that our proposed algorithm compares favorably with other state-of-the-art methods.
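The abstract names two ingredients: stochastic proximal gradient steps and a proximal-type fixed-point equation. The sketch below illustrates only those two ingredients on a toy problem and is not the paper's extra-step quasi-Newton scheme. Everything specific in it is an assumption made for illustration: the l1-regularized least-squares objective, the function names (soft_threshold, minibatch_gradient, prox_gradient_step, fixed_point_residual), the data, and the step size.

```python
import random

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1, applied componentwise."""
    return [max(abs(vi) - t, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]

def minibatch_gradient(x, data, batch_size, rng):
    """Stochastic gradient of f(x) = (1/2n) * sum_i (a_i . x - b_i)^2,
    estimated from a mini-batch sampled without replacement."""
    batch = rng.sample(data, batch_size)
    grad = [0.0] * len(x)
    for a, b in batch:
        r = sum(ai * xi for ai, xi in zip(a, x)) - b
        for j, aj in enumerate(a):
            grad[j] += r * aj / batch_size
    return grad

def prox_gradient_step(x, grad, step, mu):
    """One proximal gradient step for f(x) + mu * ||x||_1."""
    return soft_threshold([xi - step * gi for xi, gi in zip(x, grad)], step * mu)

def fixed_point_residual(x, grad, step, mu):
    """Residual x - prox(x - step * grad). With an exact gradient it
    vanishes exactly at stationary points of the composite problem."""
    p = prox_gradient_step(x, grad, step, mu)
    return [xi - pi for xi, pi in zip(x, p)]

# Hypothetical toy data: pairs (a_i, b_i).
rng = random.Random(0)
data = [([1.0, 0.0], 2.0), ([0.0, 1.0], 0.1),
        ([1.0, 0.0], 2.0), ([0.0, 1.0], 0.1)]
mu, step = 0.1, 0.5

x = [0.0, 0.0]
for _ in range(200):
    # batch_size equals len(data) here, so the gradient is exact;
    # a smaller batch_size gives a genuinely stochastic oracle.
    g = minibatch_gradient(x, data, batch_size=4, rng=rng)
    x = prox_gradient_step(x, g, step, mu)

g = minibatch_gradient(x, data, batch_size=4, rng=rng)
residual = fixed_point_residual(x, g, step, mu)
```

In this toy setting the iterates settle where the residual is numerically zero, which is the sense in which the fixed-point equation characterizes stationary points. A higher-order method of the kind described in the abstract would use such a residual to build its search directions, but the construction of those directions is not reproduced here.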
Pages: 257 - 303
Page count: 46
Related papers
50 in total
  • [1] A stochastic extra-step quasi-Newton method for nonsmooth nonconvex optimization
    Yang, Minghan
    Milzarek, Andre
    Wen, Zaiwen
    Zhang, Tong
    [J]. MATHEMATICAL PROGRAMMING, 2022, 194 (1-2) : 257 - 303
  • [2] STOCHASTIC QUASI-NEWTON METHOD FOR NONCONVEX STOCHASTIC OPTIMIZATION
    Wang, Xiao
    Ma, Shiqian
    Goldfarb, Donald
    Liu, Wei
    [J]. SIAM JOURNAL ON OPTIMIZATION, 2017, 27 (02) : 927 - 956
  • [3] A BUNDLE-TYPE QUASI-NEWTON METHOD FOR NONCONVEX NONSMOOTH OPTIMIZATION
    Tang, Chunming
    Chen, Huangyue
    Jian, Jinbao
    Liu, Shuai
    [J]. PACIFIC JOURNAL OF OPTIMIZATION, 2022, 18 (02) : 367 - 393
  • [4] A quasi-Newton algorithm for nonconvex, nonsmooth optimization with global convergence guarantees
    Curtis, F. E.
    Que, X.
    [J]. Mathematical Programming Computation, 2015, 7 (4) : 399 - 428
  • [5] A Stochastic Quasi-Newton Method for Large-Scale Nonconvex Optimization With Applications
    Chen, Huiming
    Wu, Ho-Chun
    Chan, Shing-Chow
    Lam, Wong-Hing
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31 (11) : 4776 - 4790
  • [6] A STOCHASTIC SEMISMOOTH NEWTON METHOD FOR NONSMOOTH NONCONVEX OPTIMIZATION
    Milzarek, Andre
    Xiao, Xiantao
    Cen, Shicong
    Wen, Zaiwen
    Ulbrich, Michael
    [J]. SIAM JOURNAL ON OPTIMIZATION, 2019, 29 (04) : 2916 - 2948
  • [7] A Variable Sample-size Stochastic Quasi-Newton Method for Smooth and Nonsmooth Stochastic Convex Optimization
    Jalilzadeh, Afrooz
    Nedic, Angelia
    Shanbhag, Uday V.
    Yousefian, Farzad
    [J]. 2018 IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2018 : 4097 - 4102
  • [8] A Variable Sample-Size Stochastic Quasi-Newton Method for Smooth and Nonsmooth Stochastic Convex Optimization
    Jalilzadeh, Afrooz
    Nedic, Angelia
    Shanbhag, Uday V.
    Yousefian, Farzad
    [J]. MATHEMATICS OF OPERATIONS RESEARCH, 2022, 47 (01) : 690 - 719
  • [9] A DIRECT SEARCH QUASI-NEWTON METHOD FOR NONSMOOTH UNCONSTRAINED OPTIMIZATION
    Price, C. J.
    [J]. ANZIAM JOURNAL, 2017, 59 (02) : 215 - 231
  • [10] Nonsmooth optimization via quasi-Newton methods
    Lewis, Adrian S.
    Overton, Michael L.
    [J]. Mathematical Programming, 2013, 141 : 135 - 163