Neural Networks Training on Graphics Processing Unit (GPU) Using Dynamic Parallelism (DP)

Cited by: 0
Authors
Hall, Will [1 ]
Tian, Yun [1 ]
Affiliations
[1] Eastern Washington Univ, Spokane, WA 99201 USA
Keywords
Neural network training; GPU; CUDA; Performance; Dynamic parallelism; MEMORY;
DOI
10.1007/978-3-031-16078-3_56
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Artificial Neural Networks (ANNs) are a crucial foundation for deep learning and many machine learning algorithms. Training an ANN is computationally intensive and inherently parallel, and thus may be accelerated by a Graphics Processing Unit (GPU). Due to the dependency across ANN layers created by the nature of the Back Propagation (BP) algorithm, it is quite challenging to design a highly efficient ANN training algorithm on GPU. In this work, we investigate and demonstrate that Dynamic Parallelism (DP) can further speed up an ANN training task on GPU. We implemented a generic ANN framework on GPU that supports an arbitrary number of layers and an arbitrary number of nodes in each layer. In two sets of experiments, we trained the generic ANN on GPU for handwritten digit recognition with DP enabled and disabled. Training with DP enabled achieved up to a 12.7x performance gain compared with training with DP disabled on GPU. After being trained on GPU, our neural network achieved an accuracy rate of 96% in handwritten digit recognition.
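The inter-layer dependency described in the abstract is what DP addresses: a parent kernel can launch one child grid per layer directly on the device, so the host need not intervene between layers. The sketch below is a minimal illustration of that pattern, not the paper's actual implementation; the names (`layer_forward`, `forward_pass`, `act`) and the dense sigmoid network are assumptions. It requires a GPU of compute capability 3.5 or later and compilation with `nvcc -rdc=true`; note that device-side `cudaDeviceSynchronize()` is deprecated in recent CUDA toolkits, which instead favor tail launches.

```cuda
#include <cuda_runtime.h>
#include <math.h>

// Child kernel: naive dense-layer forward pass, out = sigmoid(W * in + b).
// One thread per output neuron.
__global__ void layer_forward(const float* W, const float* b,
                              const float* in, float* out,
                              int n_in, int n_out) {
    int j = blockIdx.x * blockDim.x + threadIdx.x;
    if (j < n_out) {
        float s = b[j];
        for (int i = 0; i < n_in; ++i)
            s += W[j * n_in + i] * in[i];
        out[j] = 1.0f / (1.0f + expf(-s));  // sigmoid activation
    }
}

// Parent kernel (Dynamic Parallelism): a single device thread walks the
// layers and launches one child grid per layer, preserving the layer
// dependency entirely on the GPU instead of returning to the host.
__global__ void forward_pass(float** W, float** b, float** act,
                             const int* sizes, int n_layers) {
    for (int l = 0; l < n_layers; ++l) {
        int threads = 128;
        int blocks  = (sizes[l + 1] + threads - 1) / threads;
        layer_forward<<<blocks, threads>>>(W[l], b[l], act[l], act[l + 1],
                                           sizes[l], sizes[l + 1]);
        // Child grid must complete before the next layer consumes its output.
        // (Deprecated on-device in CUDA >= 11.6; newer code uses tail launch.)
        cudaDeviceSynchronize();
    }
}
```

Launched from the host as `forward_pass<<<1, 1>>>(...)`, the single parent thread acts purely as an on-device scheduler; all real work happens in the child grids, which is where DP's saving over repeated host-side launches comes from.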
Pages: 811-818 (8 pages)