SALR: Sharpness-Aware Learning Rate Scheduler for Improved Generalization

Cited by: 0
Authors
Yue, Xubo [1 ]
Nouiehed, Maher [2 ]
Al Kontar, Raed [1 ]
Affiliations
[1] Univ Michigan, Dept Ind & Operat Engn, Ann Arbor, MI 48109 USA
[2] Amer Univ Beirut, Dept Ind Engn & Management, Beirut 1072020, Lebanon
Funding
U.S. National Science Foundation (NSF)
Keywords
Schedules; Deep learning; Neural networks; Convergence; Bayes methods; Training; Stochastic processes; generalization; learning rate schedule; sharpness
DOI
10.1109/TNNLS.2023.3263393
CLC (Chinese Library Classification)
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
In an effort to improve generalization in deep learning and automate learning rate scheduling, we propose SALR: a sharpness-aware learning rate update technique designed to recover flat minimizers. Our method dynamically updates the learning rate of gradient-based optimizers based on the local sharpness of the loss function, allowing optimizers to automatically raise the learning rate at sharp valleys and thereby improve the chance of escaping them. We demonstrate the effectiveness of SALR when adopted by various algorithms over a broad range of networks. Our experiments indicate that SALR improves generalization, converges faster, and drives solutions to significantly flatter regions.
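The record gives only the abstract, so as a rough illustration of the idea it describes, the sketch below shows one way a sharpness-scaled learning rate could look in PyTorch. Everything in it is an assumption for illustration, not the paper's method: the function name sharpness_aware_lr, the SAM-style sharpness estimate (loss increase after a normalized gradient-ascent step of radius rho), and the scaling rule base_lr * (1 + sharpness) are all placeholders for whatever estimator and update rule the paper actually uses.

```python
import torch

def sharpness_aware_lr(model, loss_fn, data, target,
                       base_lr=0.1, rho=0.05, eps=1e-12):
    """Illustrative sharpness-scaled learning rate (assumptions noted above).

    Sharpness is approximated, SAM-style, as the loss increase after a
    gradient-ascent step of L2 radius `rho`; the paper's exact estimator
    and scaling rule may differ.
    """
    loss = loss_fn(model(data), target)
    grads = torch.autograd.grad(loss, list(model.parameters()))

    # Step size giving a perturbation of norm `rho` along the gradient.
    grad_norm = torch.sqrt(sum((g ** 2).sum() for g in grads))
    scale = rho / (grad_norm + eps)

    with torch.no_grad():
        # Perturb weights toward higher loss, measure it, then restore.
        for p, g in zip(model.parameters(), grads):
            p.add_(g * scale)
        perturbed_loss = loss_fn(model(data), target)
        for p, g in zip(model.parameters(), grads):
            p.sub_(g * scale)

    # Local sharpness estimate: how much the loss rises nearby.
    sharpness = max(0.0, (perturbed_loss - loss).item())
    # Larger learning rate in sharper regions, to help escape sharp valleys.
    return base_lr * (1.0 + sharpness)
```

In a training loop, the returned value would replace the optimizer's learning rate before each step, e.g. by assigning it to param_group['lr'] for each entry in optimizer.param_groups.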
Pages: 12518-12527
Page count: 10
Related papers
50 records in total
  • [31] Bayesian Sharpness-Aware Prompt Tuning for Cross-Domain Few-shot Learning
    Fan, Shuo
    Zhuang, Liansheng
    Li, Aodi
2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023
  • [32] ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning of Deep Neural Networks
    Kwon, Jungmin
    Kim, Jeongseop
    Park, Hyunseo
    Choi, In Kwon
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021
  • [33] SaME: Sharpness-aware Matching Ensemble for Robust Palmprint Recognition
    Liang, Xu
    Li, Zhaoqun
    Fan, Dandan
    Yang, Jinyang
    Lu, Guangming
    Zhang, David
    PATTERN RECOGNITION, ACPR 2021, PT I, 2022, 13188 : 488 - 500
  • [34] CR-SAM: Curvature Regularized Sharpness-Aware Minimization
    Wu, Tao
    Luo, Tie
    Wunsch, Donald C., II
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 6, 2024, : 6144 - 6152
  • [35] Sharpness-Aware Minimization Leads to Low-Rank Features
    Andriushchenko, Maksym
    Bahri, Dara
    Mobahi, Hossein
    Flammarion, Nicolas
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [36] Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach
    Mi, Peng
    Shen, Li
    Ren, Tianhe
    Zhou, Yiyi
    Sun, Xiaoshuai
    Ji, Rongrong
    Tao, Dacheng
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [37] Why Does Sharpness-Aware Minimization Generalize Better Than SGD?
    Chen, Zixiang
    Zhang, Junkai
    Kou, Yiwen
    Chen, Xiangning
    Hsieh, Cho-Jui
    Gu, Quanquan
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [38] Fast sharpness-aware training for periodic time series classification and forecasting
    Park, Jinseong
    Kim, Hoki
    Choi, Yujin
    Lee, Woojin
    Lee, Jaewook
    APPLIED SOFT COMPUTING, 2023, 144
  • [39] A Retinal Vessel Segmentation Method Based on the Sharpness-Aware Minimization Model
    Mariam, Iqra
    Xue, Xiaorong
    Gadson, Kaleb
    SENSORS, 2024, 24 (13)
  • [40] TOWARDS BOOSTING BLACK-BOX ATTACK VIA SHARPNESS-AWARE
    Zhang, Yukun
    Yuan, Shengming
    Song, Jingkuan
    Zhou, Yixuan
    Zhang, Lin
    He, Yulan
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 294 - 299