Incremental PID Controller-Based Learning Rate Scheduler for Stochastic Gradient Descent

Cited: 0
Authors
Wang, Zenghui [1]
Zhang, Jun [2]
Affiliations
[1] Anhui University, School of Electrical Engineering and Automation, Key Laboratory of Intelligent Computing and Signal Processing, Ministry of Education, Hefei 230601, People's Republic of China
[2] Anhui University, School of Artificial Intelligence, Hefei 230601, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Training; PD control; PI control; Feedback control; Convergence; Oscillators; Optimization; incremental proportional-integral-derivative (PID) controller; learning rate scheduler; stochastic gradient descent (SGD)
DOI
10.1109/TNNLS.2022.3213677
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
The learning rate plays a vital role in deep neural network (DNN) training. This study adapts the incremental proportional-integral-derivative (PID) controller, widely used in automatic control, as a learning rate scheduler for stochastic gradient descent (SGD). Feedback control relates the training loss to the learning rate so that the current learning rate is computed automatically; the resulting schedulers, termed incremental PID learning rates, come in two variants, PID-Base and PID-Warmup. These schedulers reduce dependence on the initial learning rate and achieve higher accuracy. Compared with multistep learning rates (MSLR), cyclical learning rates (CLR), and SGD with warm restarts (SGDR), the incremental PID learning rates obtain higher accuracy on CIFAR-10, CIFAR-100, and Tiny-ImageNet-200. We believe these methods can improve the performance of SGD.
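The abstract describes computing the learning rate via feedback from the training loss using an incremental PID controller. Below is a minimal Python sketch of that idea under stated assumptions: the incremental (velocity-form) PID update, delta_u(k) = Kp*(e(k) - e(k-1)) + Ki*e(k) + Kd*(e(k) - 2*e(k-1) + e(k-2)), is applied to the learning rate, with the error signal e(k) taken here to be the drop in training loss between steps. The error definition, gain values, clamping range, and all names (e.g., IncrementalPIDScheduler) are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch of an incremental (velocity-form) PID learning rate
# scheduler. The error signal, gains, and clamping are assumptions;
# the abstract does not give the paper's exact formulation.

class IncrementalPIDScheduler:
    """Adjusts the SGD learning rate with an incremental PID update.

    Velocity form of PID:
        delta_u(k) = Kp*(e(k) - e(k-1)) + Ki*e(k)
                     + Kd*(e(k) - 2*e(k-1) + e(k-2))
        u(k) = u(k-1) + delta_u(k)
    Here u is the learning rate and e is an assumed error signal:
    the decrease in training loss since the previous step.
    """

    def __init__(self, lr_init=0.1, kp=1e-3, ki=1e-4, kd=1e-3,
                 lr_min=1e-5, lr_max=1.0):
        self.lr = lr_init
        self.kp, self.ki, self.kd = kp, ki, kd
        self.lr_min, self.lr_max = lr_min, lr_max
        self.loss_prev = None   # L(k-1)
        self.e_prev = 0.0       # e(k-1)
        self.e_prev2 = 0.0      # e(k-2)

    def step(self, loss):
        """Feed back the current training loss; return the new learning rate."""
        if self.loss_prev is None:
            # No history yet: keep the initial learning rate.
            self.loss_prev = loss
            return self.lr
        # Assumed error signal: positive while the loss is falling, so the
        # learning rate grows when training improves and shrinks otherwise.
        e = self.loss_prev - loss
        delta = (self.kp * (e - self.e_prev)
                 + self.ki * e
                 + self.kd * (e - 2.0 * self.e_prev + self.e_prev2))
        # Incremental update of the learning rate, clamped to a safe range.
        self.lr = min(max(self.lr + delta, self.lr_min), self.lr_max)
        self.loss_prev = loss
        self.e_prev2, self.e_prev = self.e_prev, e
        return self.lr
```

A training loop would call step once per epoch (or per iteration) with the latest loss and copy the returned value into the optimizer, e.g. in PyTorch: for g in optimizer.param_groups: g['lr'] = sched.step(loss). The PID-Warmup variant mentioned in the abstract would additionally ramp the rate up during the first epochs, which this sketch omits.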
Pages: 7060-7071 (12 pages)