Stochastic Generalized Gradient Methods for Training Nonconvex Nonsmooth Neural Networks

Cited by: 0
Authors
V. I. Norkin
Affiliations
[1] V. M. Glushkov Institute of Cybernetics
[2] National Academy of Sciences of Ukraine
[3] National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”
Keywords
machine learning; deep learning; multilayer neural networks; nonsmooth nonconvex optimization; stochastic optimization; stochastic generalized gradient
DOI
Not available
Abstract
The paper observes a similarity between the stochastic optimal control of discrete dynamical systems and the training of multilayer neural networks. It focuses on contemporary deep networks with nonconvex nonsmooth loss and activation functions. The machine learning problems are treated as nonconvex nonsmooth stochastic optimization problems. So-called generalized-differentiable functions are used as a model of nonsmooth nonconvex dependences. The backpropagation method for calculating stochastic generalized gradients of the learning quality functional for such systems is substantiated on the basis of the Hamilton–Pontryagin formalism. Stochastic generalized gradient learning algorithms are extended to the training of nonconvex nonsmooth neural networks. The performance of a stochastic generalized gradient algorithm is illustrated on a linear multiclass classification problem.
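As a rough illustration of the kind of method the abstract refers to, the sketch below runs a stochastic subgradient iteration on a linear multiclass classifier with a nonsmooth hinge-type loss. It is not the paper's algorithm or code: the multiclass hinge loss, the 1/sqrt(k) step size, the toy data, and the names `train_linear_multiclass`, `predict`, and `SAMPLES` are all assumptions made for this example, and the loss used here is convex, whereas the paper addresses the more general nonconvex case.

```python
import random

# Hypothetical toy data: (features, label); the last feature is a constant bias term.
SAMPLES = [
    ((3.0, 0.0, 1.0), 0), ((3.0, 1.0, 1.0), 0), ((2.5, -0.5, 1.0), 0),
    ((-3.0, 0.0, 1.0), 1), ((-3.0, 1.0, 1.0), 1), ((-2.5, -0.5, 1.0), 1),
    ((0.0, 3.0, 1.0), 2), ((0.5, 3.5, 1.0), 2), ((-0.5, 2.5, 1.0), 2),
]


def scores_of(W, x):
    """Linear scores <w_c, x> for every class c."""
    return [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in W]


def train_linear_multiclass(samples, n_classes, n_features, steps=2000, seed=0):
    """Stochastic subgradient descent on the multiclass hinge loss
    max(0, 1 + max_{j != y} <w_j, x> - <w_y, x>), one random sample per step."""
    rng = random.Random(seed)
    W = [[0.0] * n_features for _ in range(n_classes)]
    for k in range(1, steps + 1):
        x, y = rng.choice(samples)              # random observation
        s = scores_of(W, x)
        # Wrong class with the highest score (the active piece of the max).
        j = max((c for c in range(n_classes) if c != y), key=lambda c: s[c])
        if 1.0 + s[j] - s[y] > 0.0:             # loss is positive: nonzero subgradient
            step = 1.0 / k ** 0.5               # diminishing step size (assumed schedule)
            for i in range(n_features):
                W[y][i] += step * x[i]          # subgradient w.r.t. w_y is -x
                W[j][i] -= step * x[i]          # subgradient w.r.t. w_j is +x
    return W


def predict(W, x):
    """Class with the largest linear score."""
    s = scores_of(W, x)
    return max(range(len(W)), key=lambda c: s[c])


W = train_linear_multiclass(SAMPLES, n_classes=3, n_features=3)
```

At points where the loss has a kink, any element of the subdifferential may be used as the update direction; the code simply picks the one associated with the highest-scoring wrong class.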
Pages: 714 - 729
Number of pages: 15
Related papers
50 in total
  • [1] Stochastic Generalized Gradient Methods for Training Nonconvex Nonsmooth Neural Networks
    Norkin, V. I.
    [J]. CYBERNETICS AND SYSTEMS ANALYSIS, 2021, 57 (05) : 714 - 729
  • [2] Stochastic generalized gradient method for nonconvex nonsmooth stochastic optimization
    Yu. M. Ermol'ev
    V. I. Norkin
    [J]. Cybernetics and Systems Analysis, 1998, 34 : 196 - 215
  • [3] Stochastic generalized gradient method for nonconvex nonsmooth stochastic optimization
    Ermol'ev, YM
    Norkin, VI
    [J]. CYBERNETICS AND SYSTEMS ANALYSIS, 1998, 34 (02) : 196 - 215
  • [4] Simple and Optimal Stochastic Gradient Methods for Nonsmooth Nonconvex Optimization
    Li, Zhize
    Li, Jian
    [J]. Journal of Machine Learning Research, 2022, 23
  • [5] Simple and Optimal Stochastic Gradient Methods for Nonsmooth Nonconvex Optimization
    Li, Zhize
    Li, Jian
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2022, 23
  • [6] Simple and Optimal Stochastic Gradient Methods for Nonsmooth Nonconvex Optimization
    Li, Zhize
    Li, Jian
    [J]. arXiv, 2022,
  • [7] Gradient-Free Methods for Deterministic and Stochastic Nonsmooth Nonconvex Optimization
    Lin, Tianyi
    Zheng, Zeyu
    Jordan, Michael I.
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [8] Faster Gradient-Free Proximal Stochastic Methods for Nonconvex Nonsmooth Optimization
    Huang, Feihu
    Gu, Bin
    Huo, Zhouyuan
    Chen, Songcan
    Huang, Heng
    [J]. THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 1503 - 1510
  • [9] VARIABLE METRIC PROXIMAL STOCHASTIC VARIANCE REDUCED GRADIENT METHODS FOR NONCONVEX NONSMOOTH OPTIMIZATION
    Yu, Tengteng
    Liu, Xin-wei
    Dai, Yu-hong
    Sun, Jie
    [J]. JOURNAL OF INDUSTRIAL AND MANAGEMENT OPTIMIZATION, 2022, 18 (04) : 2611 - 2631
  • [10] Generalized gradient projection neural networks for nonsmooth optimization problems
    Li GuoCheng
    Song ShiJi
    Wu Cheng
    [J]. SCIENCE CHINA-INFORMATION SCIENCES, 2010, 53 (05) : 990 - 1005