Incorporating Derivative-Free Convexity with Trigonometric Simplex Designs for Learning-Rate Estimation of Stochastic Gradient-Descent Method

Times Cited: 0
Authors
Tokgoz, Emre [1 ]
Musafer, Hassan [2 ]
Faezipour, Miad [3 ]
Mahmood, Ausif [2 ]
Affiliations
[1] Quinnipiac Univ, Sch Comp & Engn, Hamden, CT 06518 USA
[2] Univ Bridgeport, Dept Comp Sci & Engn, Bridgeport, CT 06604 USA
[3] Purdue Univ, Sch Engn Technol Elect & Comp Engn Technol, W Lafayette, IN 47907 USA
Keywords
derivative-free convexity; trigonometric simplex design; stochastic gradient descent; adaptive learning rate; deep neural network
DOI
10.3390/electronics12020419
CLC Classification Number
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
This paper proposes a mathematical theory of adaptation to the convexity of loss functions, based on the definition of condense-discrete convexity (CDC). The theory is well suited to stochastic settings and is used to extend the well-known stochastic gradient-descent (SGD) method. Changing the definition of convexity affects how the learning-rate scheduler used in SGD is explored, and therefore affects the convergence rate of the solution, which is used to measure the effectiveness of deep networks. In the proposed methodology, the CDC convexity criterion and the learning rate are directly related to each other through the difference operator. In addition, the theory of adaptation is combined with trigonometric simplex (TS) designs to explore different learning-rate schedules for the weight and bias parameters within the network. Experiments confirm that exploring learning-rate schedules with the new definition of convexity makes the optimization more effective in practice and has a strong effect on the training of deep neural networks.
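The record does not include code, so the following is only a minimal, hypothetical Python sketch of the idea outlined in the abstract: a discrete convexity test built from the difference operator is applied to the recent loss sequence and used to adapt the SGD learning rate. The function names (`second_differences`, `is_discretely_convex`, `adapt_learning_rate`), the sliding-window heuristic, and the grow/shrink factors are illustrative assumptions, not the authors' CDC or TS formulation.

```python
# Hypothetical sketch: adapt an SGD learning rate from a discrete-convexity
# test on recent loss values (illustrative only; not the paper's CDC/TS method).
from collections import deque


def second_differences(values):
    """Forward second differences: D2 f(k) = f(k+2) - 2*f(k+1) + f(k)."""
    return [values[k + 2] - 2 * values[k + 1] + values[k]
            for k in range(len(values) - 2)]


def is_discretely_convex(values):
    """A finite sequence is discretely convex if all second differences are >= 0."""
    return all(d >= 0.0 for d in second_differences(values))


def adapt_learning_rate(lr, recent_losses, grow=1.05, shrink=0.5, max_lr=0.5):
    """Assumed heuristic: grow the step size while the recent loss sequence
    looks discretely convex (stable descent), shrink it otherwise."""
    if len(recent_losses) < 3:
        return lr
    if is_discretely_convex(list(recent_losses)):
        return min(lr * grow, max_lr)
    return lr * shrink


# Usage: a toy SGD loop on f(w) = (w - 3)^2 with the adaptive step size.
if __name__ == "__main__":
    w, lr = 0.0, 0.1
    losses = deque(maxlen=5)          # sliding window of recent loss values
    for step in range(50):
        loss = (w - 3.0) ** 2
        grad = 2.0 * (w - 3.0)
        losses.append(loss)
        lr = adapt_learning_rate(lr, losses)
        w -= lr * grad
    print(f"w = {w:.4f}, final lr = {lr:.4f}")
```

In this sketch the convexity test and the learning rate are linked only through the second-difference operator on observed losses; the paper's CDC definition and its coupling with trigonometric simplex designs are presumably more refined.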
Pages: 9