Cyclical Learning Rates for Training Neural Networks

Cited by: 1498
Authors
Smith, Leslie N. [1]
Institution
[1] US Naval Research Laboratory, Code 5514, 4555 Overlook Ave SW, Washington, DC 20375, USA
DOI
10.1109/WACV.2017.58
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
It is known that the learning rate is the most important hyper-parameter to tune for training deep neural networks. This paper describes a new method for setting the learning rate, named cyclical learning rates, which practically eliminates the need to experimentally find the best values and schedule for the global learning rates. Instead of monotonically decreasing the learning rate, this method lets the learning rate cyclically vary between reasonable boundary values. Training with cyclical learning rates instead of fixed values achieves improved classification accuracy without a need to tune and often in fewer iterations. This paper also describes a simple way to estimate "reasonable bounds" - linearly increasing the learning rate of the network for a few epochs. In addition, cyclical learning rates are demonstrated on the CIFAR-10 and CIFAR-100 datasets with ResNets, Stochastic Depth networks, and DenseNets, and the ImageNet dataset with the AlexNet and GoogLeNet architectures. These are practical tools for everyone who trains neural networks.
Pages: 464-472
Number of pages: 9
Related Papers
50 records in total
  • [1] Yu, Changyong; Qi, Xin; Ma, Haitao; He, Xin; Wang, Cuirong; Zhao, Yuhai. LLR: Learning learning rates by LSTM for training neural networks. NEUROCOMPUTING, 2020, 394: 41-50
  • [2] Riaza, R; Zufiria, PJ. Rates of learning in gradient and genetic training of recurrent neural networks. ARTIFICIAL NEURAL NETS AND GENETIC ALGORITHMS, 1999: 95-99
  • [3] Deng, Jiali; Gong, Haigang; Liu, Minghui; Xie, Tianshu; Cheng, Xuan; Wang, Xiaomin; Liu, Ming. RALR: Random Amplify Learning Rates for Training Neural Networks. APPLIED SCIENCES-BASEL, 2022, 12(01)
  • [4] Iiduka, Hideaki. Appropriate Learning Rates of Adaptive Learning Rate Optimization Algorithms for Training Deep Neural Networks. IEEE TRANSACTIONS ON CYBERNETICS, 2022, 52(12): 13250-13261
  • [5] Kirkedal, Andreas; Kim, Yeon-Jun. Multilingual Deep Neural Network Training using Cyclical Learning Rate. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018: 2933-2937
  • [6] Kim, Kyoung Joo; Park, Jin Bae; Choi, Yoon Ho. The adaptive learning rates of extended Kalman filter based training algorithm for wavelet neural networks. MICAI 2006: ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2006, 4293: 327-+
  • [7] Smith, Leslie N.; Topin, Nicholay. Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates. ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING FOR MULTI-DOMAIN OPERATIONS APPLICATIONS, 2019, 11006
  • [8] Srinivas, Suraj; Kuzmin, Andrey; Nagel, Markus; van Baalen, Mart; Skliar, Andrii; Blankevoort, Tijmen. Cyclical Pruning for Sparse Neural Networks. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022: 2761-2770
  • [9] Gopalan, Arjun; Juan, Da-Cheng; Magalhaes, Cesar Ilharco; Ferng, Chun-Sung; Heydon, Allan; Lu, Chun-Ta; Pham, Philip; Yu, George. Neural Structured Learning: Training Neural Networks with Structured Signals. KDD '20: PROCEEDINGS OF THE 26TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2020: 3501-3502
  • [10] Bui, Thang D.; Ravi, Sujith; Ramavajjala, Vivek. Neural Graph Learning: Training Neural Networks Using Graphs. WSDM'18: PROCEEDINGS OF THE ELEVENTH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING, 2018: 64-71