Optimization for Deep Learning: An Overview

Author
Ruo-Yu Sun
Affiliations
[1] University of Illinois at Urbana-Champaign, Department of Industrial and Enterprise Systems Engineering, and affiliated with the Coordinated Science Laboratory
[2] University of Illinois at Urbana-Champaign, Department of Electrical and Computer Engineering
Keywords
Deep learning; Non-convex optimization; Neural networks; Convergence; Landscape
Mathematics Subject Classification (MSC): 90C30; 68Q32
DOI
Not available
Abstract
Optimization is a critical component of deep learning. We consider optimization for neural networks an interesting topic for theoretical research for several reasons. First, its tractability despite non-convexity is an intriguing question, and answering it may greatly expand our understanding of tractable problems. Second, classical optimization theory is far from sufficient to explain many of the observed phenomena. We therefore aim to understand the challenges and opportunities from a theoretical perspective and to review the existing research in this field. The review has three parts. First, we discuss gradient explosion/vanishing and the more general issue of an undesirable spectrum, followed by practical solutions including careful initialization, normalization methods, and skip connections. Second, we review generic optimization methods used to train neural networks, such as stochastic gradient descent and adaptive gradient methods, together with existing theoretical results. Third, we review existing research on global issues in neural network training, including results on the global landscape, mode connectivity, the lottery ticket hypothesis, and the neural tangent kernel.
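As a rough illustration of the gradient explosion/vanishing issue named in the abstract, the sketch below is not taken from the paper. It assumes a hypothetical one-dimensional linear chain in which every layer multiplies its input by the same scalar weight, so the derivative of the output with respect to the input is that weight raised to the depth. The function name `backprop_gradient_magnitude` and the chosen weights and depth are illustrative assumptions.

```python
def backprop_gradient_magnitude(weight: float, depth: int) -> float:
    """Return |dy/dx| for the toy chain y = weight * weight * ... * weight * x.

    Assumption: every one of the `depth` layers is a scalar multiplication
    by the same `weight`, with no nonlinearity, normalization, or skip path.
    """
    grad = 1.0
    for _ in range(depth):
        # Chain rule: each layer scales the backpropagated gradient by its weight.
        grad *= weight
    return abs(grad)


if __name__ == "__main__":
    depth = 50
    for w in (0.5, 1.0, 1.5):
        magnitude = backprop_gradient_magnitude(w, depth)
        print(f"weight={w}: |dy/dx| = {magnitude:.3e}")
```

In this toy setting a weight below 1 drives the gradient toward zero as depth grows, a weight above 1 makes it grow without bound, and only a weight of exactly 1 keeps it stable. Real networks have matrix-valued layers and nonlinearities, so this conveys only the intuition behind remedies such as careful initialization.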
Pages: 249 - 294
Page count: 45
Related papers
50 in total
  • [1] Optimization for Deep Learning: An Overview
    Sun, Ruo-Yu
    [J]. JOURNAL OF THE OPERATIONS RESEARCH SOCIETY OF CHINA, 2020, 8 (02) : 249 - 294
  • [2] Deep Learning: An Overview
    Farsal, Wissal
    Anter, Samir
    Ramdani, Mohammed
    [J]. PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS: THEORIES AND APPLICATIONS (SITA'18), 2018,
  • [3] Overview of Deep Learning
    Du, Xuedan
    Cai, Yinghao
    Wang, Shuo
    Zhang, Leijie
    [J]. 2016 31ST YOUTH ACADEMIC ANNUAL CONFERENCE OF CHINESE ASSOCIATION OF AUTOMATION (YAC), 2016, : 159 - 164
  • [4] Deep Reinforcement Learning: An Overview
    Mousavi, Seyed Sajad
    Schukat, Michael
    Howley, Enda
    [J]. PROCEEDINGS OF SAI INTELLIGENT SYSTEMS CONFERENCE (INTELLISYS) 2016, VOL 2, 2018, 16 : 426 - 440
  • [5] Lightweight Deep Learning: An Overview
    Wang, Ching-Hao
    Huang, Kang-Yang
    Yao, Yi
    Chen, Jun-Cheng
    Shuai, Hong-Han
    Cheng, Wen-Huang
    [J]. IEEE CONSUMER ELECTRONICS MAGAZINE, 2024, 13 (04) : 51 - 64
  • [6] Overview of Deep Learning Research
    Liu, Yanmei
    Chen, Yuda
    [J]. PROCEEDINGS OF THE 5TH ANNUAL INTERNATIONAL CONFERENCE ON SOCIAL SCIENCE AND CONTEMPORARY HUMANITY DEVELOPMENT (SSCHD 2019), 2019, 376 : 719 - 725
  • [7] An overview of deep learning techniques
    Vogt, Michael
    [J]. AT-AUTOMATISIERUNGSTECHNIK, 2018, 66 (09) : 690 - 703
  • [8] A Selective Overview of Deep Learning
    Fan, Jianqing
    Ma, Cong
    Zhong, Yiqiao
    [J]. STATISTICAL SCIENCE, 2021, 36 (02) : 264 - 290
  • [9] Deep learning on image denoising: An overview
    Tian, Chunwei
    Fei, Lunke
    Zheng, Wenxian
    Xu, Yong
    Zuo, Wangmeng
    Lin, Chia-Wen
    [J]. NEURAL NETWORKS, 2020, 131 : 251 - 275
  • [10] An overview of deep learning in the field of dentistry
    Hwang, Jae-Joon
    Jung, Yun-Hoa
    Cho, Bong-Hae
    Heo, Min-Suk
    [J]. IMAGING SCIENCE IN DENTISTRY, 2019, 49 (01) : 1 - 7