Neural Networks Training on Graphics Processing Unit (GPU) Using Dynamic Parallelism (DP)

Cited by: 0
|
Authors
Hall, Will [1 ]
Tian, Yun [1 ]
Affiliations
[1] Eastern Washington Univ, Spokane, WA 99201 USA
Keywords
Neural network training; GPU; CUDA; Performance; Dynamic parallelism; MEMORY;
DOI
10.1007/978-3-031-16078-3_56
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Artificial Neural Networks (ANN) are a crucial foundation for deep learning and many machine learning algorithms. Training an ANN is computationally intensive and inherently parallel, and thus may be accelerated by a Graphics Processing Unit (GPU). Due to the dependency across different ANN layers, which is created by the nature of the Back Propagation (BP) algorithm, it is quite challenging to design a highly efficient ANN training algorithm on the GPU. In this work, we investigate and demonstrate that the Dynamic Parallelism (DP) technology can further speed up an ANN training task on the GPU. We implemented a generic ANN framework on the GPU that supports an arbitrary number of layers and an arbitrary number of nodes in each layer. In two sets of experiments, we trained the generic ANN on the GPU for handwritten digit recognition with DP enabled and disabled. We observed that training with DP enabled achieved up to a 12.7x performance gain compared with training with DP disabled. After being trained on the GPU, our neural network achieved an accuracy rate of 96% in handwritten digit recognition.
Pages: 811-818
Page count: 8
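The core idea of the abstract, in brief: with DP, a parent kernel running on the GPU can itself launch the per-layer child kernels, so the layer-to-layer dependency created by BP is enforced on the device instead of through one host-side launch and synchronization per layer. Below is a minimal, illustrative CUDA sketch of a DP-driven forward pass for a fully connected network. It is not the authors' implementation; the kernel names, the sigmoid activation, the row-major weight layout, and all identifiers are assumptions made for illustration.

// Minimal sketch (not the paper's code) of Dynamic Parallelism for an
// ANN forward pass. Assumed layout: W[l] is an (nOut x nIn) row-major
// weight matrix, b[l] a bias vector, a[l] the activations of layer l.
// Build with relocatable device code, e.g.:
//   nvcc -arch=sm_60 -rdc=true dp_forward.cu -o dp_forward -lcudadevrt
#include <math.h>

// Child kernel: one thread per output neuron computes
// aOut[j] = sigmoid(sum_i W[j][i] * aIn[i] + b[j]).
__global__ void layerForward(const float *W, const float *b,
                             const float *aIn, float *aOut,
                             int nIn, int nOut) {
    int j = blockIdx.x * blockDim.x + threadIdx.x;
    if (j >= nOut) return;
    float z = b[j];
    for (int i = 0; i < nIn; ++i)
        z += W[j * nIn + i] * aIn[i];
    aOut[j] = 1.0f / (1.0f + expf(-z));  // sigmoid (assumed activation)
}

// Parent kernel: a single GPU thread walks the layers and launches one
// child grid per layer. Child grids launched into the same device-side
// stream execute in launch order, so the layer dependency imposed by BP
// is honored entirely on the device; the host launches and synchronizes
// once for the whole pass instead of once per layer.
__global__ void forwardPass(float * const *W, float * const *b,
                            float * const *a, const int *sizes,
                            int numLayers) {
    const int threads = 128;
    for (int l = 0; l < numLayers - 1; ++l) {
        int nIn = sizes[l], nOut = sizes[l + 1];
        int blocks = (nOut + threads - 1) / threads;
        layerForward<<<blocks, threads>>>(W[l], b[l], a[l], a[l + 1],
                                          nIn, nOut);
    }
}

On the host, a single launch such as forwardPass<<<1, 1>>>(dW, dB, dA, dSizes, numLayers) followed by one cudaDeviceSynchronize() drives the whole pass (dW, dB, dA, and dSizes are hypothetical device allocations). The backward pass can be driven the same way, which is plausibly where the reported speedup over per-layer host launches comes from.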
Related Papers
50 records in total
  • [21] Exploiting Parallelism in the Simulation of General Purpose Graphics Processing Unit Program
    Zhao, Xia
    Ma, Sheng
    Chen, Wei
    Wang, Zhiying
    Journal of Shanghai Jiaotong University (Science), 2016, 21(3): 280-288
  • [23] SuperNeurons: Dynamic GPU Memory Management for Training Deep Neural Networks
    Wang, Linnan
    Ye, Jinmian
    Zhao, Yiyang
    Wu, Wei
    Li, Ang
    Song, Shuaiwen Leon
    Xu, Zenglin
    Kraska, Tim
ACM SIGPLAN NOTICES, 2018, 53(1): 41-53
  • [25] GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism
    Huang, Yanping
    Cheng, Youlong
    Bapna, Ankur
    Firat, Orhan
    Chen, Mia Xu
    Chen, Dehao
    Lee, HyoukJoong
    Ngiam, Jiquan
    Le, Quoc V.
    Wu, Yonghui
    Chen, Zhifeng
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [26] Performance Evaluation of STBC-OFDM WiMAX System using Graphics Processing Unit (GPU)
    Yadav, Satyendra Singh
    Patra, Sarat Kumar
2014 INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND APPLICATIONS (ICHPCA), 2014
  • [27] Fast analytical modeling of compton scatter using point clouds and graphics processing unit (GPU)
    Sitek, Arkadiusz
    El Fakhri, Georges
    Ouyang, Jinsong
    Maltz, Jonathan S.
2007 IEEE NUCLEAR SCIENCE SYMPOSIUM CONFERENCE RECORD, VOLS 1-11, 2007: 4546+
  • [28] An Efficient Particle Filter-based Tracking Method Using Graphics Processing Unit (GPU)
    Li, Peihua
JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2012, 68(3): 317-332
  • [29] Digital design of a dedicated Graphics Processing Unit (GPU) architecture for microcontrollers
    Zafar, Saad
    Kataria, Sushant
    Sharma, Abhishek
2014 INTERNATIONAL CONFERENCE ON ELECTRONICS AND COMMUNICATION SYSTEMS (ICECS), 2014
  • [30] Storage System Design in Graphics Processing Unit (GPU) Based on PCM
    Wang, Shiyu
2014 2ND INTERNATIONAL CONFERENCE ON SOCIAL SCIENCE AND HEALTH (ICSSH 2014), PT 2, 2014, 56: 270-273