Sketch-Based Empirical Natural Gradient Methods for Deep Learning

Cited by: 0
Authors
Minghan Yang
Dong Xu
Zaiwen Wen
Mengyun Chen
Pengxiang Xu
Affiliations
[1] Peking University, Beijing International Center for Mathematical Research
[2] Peking University, School of Mathematical Sciences
[3] Peking University, Beijing International Center for Mathematical Research, College of Engineering and Center for Data Science
[4] Huawei Technologies Co. Ltd
[5] Peng Cheng Laboratory
Source
Journal of Scientific Computing, 2022, 92 (03)
Keywords
Deep learning; Natural gradient methods; Sketch-based methods; Convergence; 90C06; 90C26
DOI
Not available
Abstract
In this paper, we develop an efficient sketch-based empirical natural gradient method (SENG) for large-scale deep learning problems. The empirical Fisher information matrix is usually low-rank, since sampling is practical only on a small amount of data at each iteration. Although the corresponding natural gradient direction lies in a small subspace, both the computational cost and the memory requirement remain intractable because of the high dimensionality. We design randomized techniques for different neural network structures to resolve these challenges. For layers of reasonable dimension, sketching can be performed on a regularized least squares subproblem. Otherwise, since the gradient is a vectorization of the product of two matrices, we apply sketching to low-rank approximations of these matrices to compute the most expensive parts. A distributed version of SENG is also developed for extremely large-scale applications. Global convergence to stationary points is established under mild assumptions, and fast linear convergence is analyzed in the neural tangent kernel (NTK) case. Extensive experiments on convolutional neural networks show that SENG is competitive with state-of-the-art methods. On the task of training ResNet50 on ImageNet-1k, SENG achieves 75.9% Top-1 testing accuracy within 41 epochs. Experiments on distributed large-batch training of ResNet50 on ImageNet-1k show that the scaling efficiency is quite reasonable.
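The low-rank structure described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' SENG algorithm. It assumes the empirical Fisher matrix has the form F = U Uᵀ / n, where the n columns of U are per-sample gradients, adds a damping term `lam`, and uses the Woodbury identity so that only an n × n system is solved. The Gaussian sketch matrix `Omega`, the sketch size `k`, and both function names are hypothetical choices made for illustration.

```python
import numpy as np


def eng_direction(U, g, lam):
    """Solve (lam * I + U @ U.T / n) d = g using only an n x n system.

    U   : (dim, n) matrix whose columns are per-sample gradients (assumed form)
    g   : (dim,) gradient vector
    lam : positive damping parameter (assumption, not from the abstract)
    """
    _, n = U.shape
    # Woodbury identity: the inverse only needs the n x n Gram matrix U.T @ U.
    small = n * lam * np.eye(n) + U.T @ U
    return (g - U @ np.linalg.solve(small, U.T @ g)) / lam


def sketched_eng_direction(U, g, lam, k, rng):
    """Approximate variant: form the n x n system from sketched quantities.

    A hypothetical Gaussian sketch Omega (k x dim) replaces U.T @ U and
    U.T @ g by their sketched counterparts, so the long inner products run
    over k rows instead of dim rows.
    """
    dim, n = U.shape
    Omega = rng.standard_normal((k, dim)) / np.sqrt(k)
    SU = Omega @ U  # (k, n) sketched per-sample gradients
    Sg = Omega @ g  # (k,) sketched gradient
    small = n * lam * np.eye(n) + SU.T @ SU
    return (g - U @ np.linalg.solve(small, SU.T @ Sg)) / lam


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    U = rng.standard_normal((2000, 16))  # 16 samples, 2000 parameters
    g = U.mean(axis=1)                   # mini-batch gradient
    d_exact = eng_direction(U, g, lam=0.1)
    d_sketch = sketched_eng_direction(U, g, lam=0.1, k=400, rng=rng)
    rel_err = np.linalg.norm(d_exact - d_sketch) / np.linalg.norm(d_exact)
    print("relative error of sketched direction:", rel_err)
```

The exact variant never forms the dim × dim Fisher matrix, which is the point of exploiting the low rank. The sketched variant trades accuracy for cheaper inner products, and its error depends on `k` and on the random draw.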
Related papers
50 in total
  • [1] Sketch-Based Empirical Natural Gradient Methods for Deep Learning
    Yang, Minghan
    Xu, Dong
    Wen, Zaiwen
    Chen, Mengyun
    Xu, Pengxiang
    [J]. JOURNAL OF SCIENTIFIC COMPUTING, 2022, 92 (03)
  • [2] CASQ: Accelerate Distributed Deep Learning with Sketch-Based Gradient Quantization
    Ge, Keshi
    Zhang, Yiming
    Fu, Yongquan
    Lai, Zhiquan
    Deng, Xiaoge
    Li, Dongsheng
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER 2021), 2021: 825-826
  • [3] Sketch-based histogram of orientation gradient for face sketch recognition
    Li, Weihong
    Fu, Weifeng
    Zhang, Zhen
    Gong, Weiguo
    [J]. Yi Qi Yi Biao Xue Bao/Chinese Journal of Scientific Instrument, 2015, 36 (02): 368-376
  • [4] Diagram Image Retrieval using Sketch-Based Deep Learning and Transfer Learning
    Bhattarai, Manish
    Oyen, Diane
    Castorena, Juan
    Yang, Liping
    Wohlberg, Brendt
    [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020), 2020: 663-672
  • [5] Deep Sketch-Based Modeling: Tips and Tricks
    Zhong, Yue
    Gryaditskaya, Yulia
    Zhang, Honggang
    Song, Yi-Zhe
    [J]. 2020 INTERNATIONAL CONFERENCE ON 3D VISION (3DV 2020), 2020: 543-552
  • [6] Users Personalized Sketch-Based Image Retrieval Using Deep Transfer Learning
    Huo, Qiming
    Wang, Jingyu
    Qi, Qi
    Sun, Haifeng
    Ge, Ce
    Zhao, Yu
    [J]. KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT (KSEM 2018), PT I, 2018, 11061: 160-168
  • [7] Sketch-based evaluation of image segmentation methods
    Gavilan, David
    Takahashi, Hiroki
    Saito, Suguru
    Nakajima, Masayuki
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2007, E90D (01): 156-164
  • [8] FaceShop: Deep Sketch-based Face Image Editing
    Portenier, Tiziano
    Hu, Qiyang
    Szabo, Attila
    Bigdeli, Siavash Arjomand
    Favaro, Paolo
    Zwicker, Matthias
    [J]. ACM TRANSACTIONS ON GRAPHICS, 2018, 37 (04)
  • [9] SketchHairSalon: Deep Sketch-based Hair Image Synthesis
    Xiao, Chufeng
    Yu, Deng
    Han, Xiaoguang
    Zheng, Youyi
    Fu, Hongbo
    [J]. ACM TRANSACTIONS ON GRAPHICS, 2021, 40 (06)
  • [10] DeepFaceVideoEditing: Sketch-based Deep Editing of Face Videos
    Liu, Feng-Lin
    Chen, Shu-Yu
    Lai, Yu-Kun
    Li, Chunpeng
    Jiang, Yue-Ren
    Fu, Hongbo
    Gao, Lin
    [J]. ACM TRANSACTIONS ON GRAPHICS, 2022, 41 (04)