GradVis: Visualization and Second Order Analysis of Optimization Surfaces during the Training of Deep Neural Networks

Cited by: 2
Authors
Chatzimichailidis, Avraam [1 ,3 ]
Pfreundt, Franz-Josef [1 ,2 ]
Gauger, Nicolas R. [3 ]
Keuper, Janis [1 ,4 ]
Affiliations
[1] Fraunhofer ITWM, Competence Ctr High Performance Comp, Kaiserslautern, Germany
[2] Fraunhofer Ctr Machine Learning, Berlin, Germany
[3] TU Kaiserslautern, Chair Sci Comp, Kaiserslautern, Germany
[4] Offenburg Univ, Inst Machine Learning & Analyt, Offenburg, Germany
Keywords
visualization; eigenvalues; parallelization; deep learning; second-order
DOI
10.1109/MLHPC49564.2019.00012
CLC number
TP18 [Theory of Artificial Intelligence]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Current training methods for deep neural networks boil down to very high dimensional and non-convex optimization problems, which are usually solved by a wide range of stochastic gradient descent methods. While these approaches tend to work well in practice, there are still many gaps in the theoretical understanding of key aspects such as convergence and generalization guarantees, which are induced by the properties of the optimization surface (loss landscape). In order to gain deeper insights, a number of recent publications have proposed methods to visualize and analyze the optimization surface. However, the computational cost of these methods is very high, making it hardly feasible to apply them to larger networks. In this paper, we present the GradVis Toolbox, an open-source library for efficient and scalable visualization and analysis of deep neural network loss landscapes in TensorFlow and PyTorch. By introducing more efficient mathematical formulations and a novel parallelization scheme, GradVis makes it possible to plot 2D and 3D projections of optimization surfaces and trajectories, as well as high-resolution second-order gradient information, for large networks.
Pages: 66-74 (9 pages)
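This record only summarizes what GradVis does; its actual API is not shown here. As a rough illustration of the two techniques the abstract mentions, the sketch below implements (a) a 2D projection of the loss surface along two norm-matched random directions around the trained weights, and (b) the largest Hessian eigenvalue via power iteration on Hessian-vector products (Pearlmutter's trick), the standard way to obtain second-order information without ever forming the Hessian. All function names (`loss_surface_2d`, `top_hessian_eigenvalue`, etc.) are hypothetical plain-PyTorch helpers, not part of the GradVis Toolbox.

```python
# Minimal, illustrative sketch in plain PyTorch -- NOT the GradVis API.
import torch


def flat_params(model):
    # All parameters concatenated into one 1-D vector.
    return torch.cat([p.detach().reshape(-1) for p in model.parameters()])


def set_params(model, flat):
    # Write a flat parameter vector back into the model's tensors.
    offset = 0
    for p in model.parameters():
        n = p.numel()
        p.data.copy_(flat[offset:offset + n].view_as(p))
        offset += n


def loss_surface_2d(model, loss_fn, batch, steps=25, span=1.0):
    """Evaluate the loss on a grid theta + a*d1 + b*d2 around the current
    weights theta, using two norm-matched random directions (a crude
    stand-in for the filter-wise normalization used in the literature)."""
    x, y = batch
    theta = flat_params(model)
    d1 = torch.randn_like(theta)
    d2 = torch.randn_like(theta)
    d1 *= theta.norm() / d1.norm()
    d2 *= theta.norm() / d2.norm()
    alphas = torch.linspace(-span, span, steps)
    surface = torch.empty(steps, steps)
    with torch.no_grad():
        for i, a in enumerate(alphas):
            for j, b in enumerate(alphas):
                set_params(model, theta + a * d1 + b * d2)
                surface[i, j] = loss_fn(model(x), y)
    set_params(model, theta)  # restore the trained weights
    return alphas, surface


def top_hessian_eigenvalue(model, loss_fn, batch, iters=20):
    """Largest eigenvalue of the loss Hessian via power iteration; each
    step costs one Hessian-vector product (two backward passes)."""
    x, y = batch
    params = [p for p in model.parameters() if p.requires_grad]
    loss = loss_fn(model(x), y)
    grads = torch.autograd.grad(loss, params, create_graph=True)
    v = [torch.randn_like(p) for p in params]
    vnorm = torch.sqrt(sum((vi * vi).sum() for vi in v))
    v = [vi / vnorm for vi in v]
    eig = 0.0
    for _ in range(iters):
        gv = sum((g * vi).sum() for g, vi in zip(grads, v))      # g . v
        hv = torch.autograd.grad(gv, params, retain_graph=True)  # H v
        eig = sum((h * vi).sum() for h, vi in zip(hv, v)).item() # Rayleigh quotient
        hnorm = torch.sqrt(sum((h * h).sum() for h in hv))
        v = [h / hnorm for h in hv]
    return eig
```

Each grid point costs one forward pass, and each power-iteration step costs one Hessian-vector product (roughly two backward passes). This naive scan illustrates why loss-landscape visualization becomes expensive for large networks, and why the paper's more efficient formulations and parallelization scheme matter.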