Empirical loss landscape analysis in deep learning: A survey

Authors
Liang R. [1, 3]
Liu B. [1, 2]
Sun Y. [4]
Affiliations
[1] Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing
[2] National Center for Mathematics and Interdisciplinary Sciences, Beijing
[3] School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing
[4] School of Mathematical Sciences, Nanjing Normal University, Nanjing
Keywords
deep learning; empirical loss function; landscape analysis
DOI
10.12011/SETP2021-2266
Abstract
Empirical loss landscape analysis is critical to revealing why deep networks are easy to optimize, and it has attracted considerable interest in machine learning and mathematical optimization. The main goal of this survey is to provide a comprehensive, state-of-the-art review of empirical loss landscape analysis, covering the number and spatial distribution of local minima, the connectivity between global optima, the global optimality of critical points, the convergence of gradient descent, and the visualization of the empirical loss landscape. The review also identifies challenges and highlights opportunities for future work. © 2023 Systems Engineering Society of China. All rights reserved.
Pages: 813-823
Page count: 10