Efficient Neural Network Training via Forward and Backward Propagation Sparsification

Cited by: 0
Authors
Zhou, Xiao [1]
Zhang, Weizhong [1]
Chen, Zonghao [2]
Diao, Shizhe [1]
Zhang, Tong [1]
Affiliations
[1] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China
[2] Tsinghua Univ, Beijing, Peoples R China
Keywords
DOI
None available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Sparse training is a natural idea for accelerating the training of deep neural networks and reducing memory usage, especially since large modern networks are significantly over-parameterized. However, most existing methods fail to achieve this goal in practice, because the chain-rule-based estimators they use for the gradient with respect to the structure parameters require dense computation in at least the backward-propagation step. This paper solves this problem by proposing an efficient sparse training method with completely sparse forward and backward passes. We first formulate the training process as a continuous minimization problem under a global sparsity constraint. We then separate the optimization into two steps, corresponding to the weight update and the structure-parameter update. For the former, we use the conventional chain rule, which can be made sparse by exploiting the sparse network structure. For the latter, instead of the chain-rule-based gradient estimators used in existing methods, we propose a variance-reduced policy gradient estimator that requires only two forward passes and no backward propagation, thus achieving completely sparse training. We prove that the variance of our gradient estimator is bounded. Extensive experiments on real-world datasets demonstrate that, compared to previous methods, our algorithm is much more effective at accelerating training, up to an order of magnitude faster.
Pages: 14
Related Papers
50 records in total
  • [1] Feed Forward Neural Network Sparsification with Dynamic Pruning
    Chouliaras, Andreas
    Fragkou, Evangelia
    Katsaros, Dimitrios
[J]. 25TH PAN-HELLENIC CONFERENCE ON INFORMATICS WITH INTERNATIONAL PARTICIPATION (PCI2021), 2021: 12-17
  • [2] Training Recurrent Neural Networks via Forward Propagation Through Time
    Kag, Anil
    Saligrama, Venkatesh
[J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021
  • [3] Array Aware Training/Pruning: Methods for Efficient Forward Propagation on Array-based Neural Network Accelerators
    Chitty-Venkata, Krishna Teja
    Somani, Arun K.
[J]. 2020 IEEE 31ST INTERNATIONAL CONFERENCE ON APPLICATION-SPECIFIC SYSTEMS, ARCHITECTURES AND PROCESSORS (ASAP 2020), 2020: 37-44
  • [4] The Hessian by blocks for neural network by backward propagation
    Bessi, Radhia
    Gmati, Nabil
[J]. JOURNAL OF TAIBAH UNIVERSITY FOR SCIENCE, 2024, 18 (01)
  • [5] Forward-forward training of an optical neural network
    Oguz, Ilker
    Ke, Junjie
    Weng, Qifei
    Yang, Feng
    Yildirim, Mustafa
    Dinc, Niyazi Ulas
    Hsieh, Jih-Liang
    Moser, Christophe
    Psaltis, Demetri
[J]. OPTICS LETTERS, 2023, 48 (20): 5249-5252
  • [6] Efficient Robust Training via Backward Smoothing
    Chen, Jinghui
    Cheng, Yu
    Gan, Zhe
    Gu, Quanquan
    Liu, Jingjing
[J]. THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022: 6222-6230
  • [7] A PARALLEL IMPLEMENTATION OF THE BACKWARD ERROR PROPAGATION NEURAL NETWORK TRAINING ALGORITHM - EXPERIMENTS IN EVENT IDENTIFICATION
    SITTIG, DF
    ORR, JA
[J]. COMPUTERS AND BIOMEDICAL RESEARCH, 1992, 25 (06): 547-561
  • [8] Combining Forward and Backward Propagation
    Zaki, Amira
    Abdennadher, Slim
    Fruehwirth, Thom
[J]. FRONTIERS OF COMBINING SYSTEMS, FROCOS 2015, 2015, 9322: 307-322
  • [9] Efficient Recurrent Neural Networks via Importance-Based Sparsification
    Ren, Jiankang
    Ni, Zheng
    Su, Xiaoyan
    Zhang, Haijun
    Li, Haifang
    Li, Shengyu
[J]. JOURNAL OF CIRCUITS SYSTEMS AND COMPUTERS, 2024
  • [10] Diagnosis of Neural Network via Backward Deduction
    Yin, Peifeng
    Huang, Lei
    Lee, Sunhwan
    Qiao, Mu
    Asthana, Shubhi
Nakamura, Taiga
[J]. 2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2019: 260-267