High Performance Parallelization of COMPSYN on a Cluster of Multicore Processors with GPUs

Cited by: 1
Authors
Alessi, Ferdinando [1]
Massini, Annalisa [1]
Basili, Roberto [2]
Affiliations
[1] Sapienza Univ Rome, Dept Comp Sci, Rome, Italy
[2] Ist Nazl Geofis & Vulcanol, Rome, Italy
Keywords
GPU; CUDA; synthetic seismogram
DOI
10.1016/j.procs.2012.04.103
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
In this work we propose a high performance parallelization of the software package COMPSYN, which produces synthetic seismograms, on a cluster of multicore processors with multiple GPUs. To design and implement the proposed high performance version, we started from a naive parallel version of COMPSYN. The naive version consists of a simple parallelization on both the device side, obtained by exploiting CUDA, and the host side, obtained by exploiting the MPI paradigm and the OpenMP API. The proposed high performance version implements several practical CUDA programming techniques and deeply exploits the GPU architecture, thus achieving much better performance than the naive version. We compare the performance of the proposed high performance version and of the naive one with the performance of the version running on the cluster of multicore processors without invoking the GPUs. The high performance GPU version achieves a speedup of 25x over the version running on the cluster of multicore processors without GPUs, against 10x for the naive version. Relative to the sequential version, we estimate a speedup of about 380x for the high performance GPU version, against about 140x for the naive version.
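As a reading aid, the speedup figures quoted in the abstract can be related to one another with simple arithmetic. The sketch below is not taken from the paper: the variable names are hypothetical, the inputs are only the four figures stated above, and the "implied" CPU-cluster speedup is a derived estimate that assumes all figures refer to the same workload.

```python
# Speedup figures quoted in the abstract (the sequential-baseline
# figures are described there as approximate estimates).
speedup_vs_cluster = {"naive": 10.0, "high_performance": 25.0}
speedup_vs_sequential = {"naive": 140.0, "high_performance": 380.0}


def gain_over_naive(table):
    """Ratio of the high performance speedup to the naive speedup."""
    return table["high_performance"] / table["naive"]


# How much faster the high performance GPU version is than the naive one,
# measured against each baseline.
gain_cluster = gain_over_naive(speedup_vs_cluster)
gain_sequential = gain_over_naive(speedup_vs_sequential)

# Assumption: dividing the two baselines gives a rough estimate of the
# speedup of the CPU-only cluster version over the sequential version.
implied_cluster_speedup = {
    name: speedup_vs_sequential[name] / speedup_vs_cluster[name]
    for name in speedup_vs_cluster
}

print(f"gain vs naive (cluster baseline):    {gain_cluster:.2f}x")
print(f"gain vs naive (sequential baseline): {gain_sequential:.2f}x")
for name, value in implied_cluster_speedup.items():
    print(f"implied CPU-cluster speedup from {name} figures: {value:.1f}x")
```

The two baselines give slightly different gains (2.5x and roughly 2.7x), and the two implied CPU-cluster speedups do not coincide exactly, which is consistent with the sequential-baseline figures being estimates rather than measurements.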
Pages: 966-975
Page count: 10