Exploiting Parallelism in the Simulation of General Purpose Graphics Processing Unit Program

Cited by: 0

Authors:

Zhao Xia (赵夏) [1,2]

Ma Sheng (马胜) [1,2]

Chen Wei (陈微) [1,2]

Wang Zhiying (王志英) [1,2]

Affiliations:

[1] State Key Laboratory of High Performance Computing

[2] College of Computer, National University of Defense Technology

Source:

Journal of Shanghai Jiaotong University (Science) | 2016, Vol. 21, No. 03

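Funding:

Doctoral Program Special Fund of the Ministry of Education of China; Specialized Research Fund for the Doctoral Program of Higher Education; National Natural Science Foundation of China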

Keywords:

general purpose graphics processing unit (GPGPU); multicore; intra-kernel; inter-kernel; parallel

DOI:

Not available

Chinese Library Classification (CLC) number:

TP391.41

Discipline classification code:

080203

Abstract:

Simulation is an important means of evaluating the performance of computer architectures. At present, serial simulation of general purpose graphics processing unit (GPGPU) architectures is the main bottleneck limiting simulation speed. To address this issue, we propose intra-kernel parallelization on a multicore processor and inter-kernel parallelization on a multiple-machine platform, and we apply both methods to the GPGPU-sim simulator. The intra-kernel parallelization method first parallelizes the serial simulation of multiple compute units within one cycle. It then parallelizes the timing and functional simulation to reduce the performance loss caused by synchronization between different compute units. The inter-kernel parallelization method divides the multiple kernels of a CUDA program into several groups and distributes these groups across multiple simulation hosts. Experimental results show that the intra-kernel parallelization method achieves a speed-up of up to 12 with a maximum error rate of 0.0094% on a 32-core machine, and that the inter-kernel parallelization method accelerates the simulation by a factor of up to 3.9 with a maximum error rate of 0.11% on four simulation hosts. Because the two methods are orthogonal, they can be combined on multiple multicore hosts to obtain further performance improvements.

Citation

Pages: 280-288

Number of pages: 9
