Evaluation of successive CPUs/APUs/GPUs based on an OpenCL finite difference stencil

Cited by: 4
Authors
Calandra, Henri [2]
Dolbeau, Romain [3]
Fortin, Pierre [1]
Lamotte, Jean-Luc [1]
Said, Issam [1]
Affiliations
[1] Univ Paris 06, UPMC, CNRS, LIP6, UMR7606, 4 Pl Jussieu, F-75252 Paris 05, France
[2] Total, F-64000 Pau, France
[3] CAPS Entreprise, F-35000 Rennes, France
Keywords
APU; GPU; finite difference stencil; PCI Express bus; high performance scientific computing;
DOI
10.1109/PDP.2013.65
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
The AMD APU (Accelerated Processing Unit) architecture, which combines CPU and GPU cores on the same die, is promising for GPU applications whose performance is bottlenecked by the low PCI Express communication rate. However, the first APU generations still have separate CPU and GPU memory partitions. Currently, the GPUs integrated into APUs are also less powerful than discrete GPUs. In this paper, we therefore investigate the value of APUs for scientific computing by evaluating and comparing the performance of two successive AMD APUs (family codenames Llano and Trinity), two successive discrete GPUs (chip codenames Cayman and Tahiti), and one hexa-core AMD CPU. For this purpose, we rely on a 3D finite difference stencil that is optimized and tuned in OpenCL. We detail the most interesting optimizations for each architecture and show very good performance in OpenCL: up to 500 Gflops on Tahiti. Finally, our results show that APU integrated GPUs outperform CPUs, and that integrated GPUs of upcoming APUs may match discrete GPUs for problems with high communication requirements.
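For readers unfamiliar with the computation named in the abstract, the following is a minimal sketch of a 3D finite difference stencil. It is an illustration only: the abstract does not state the stencil order or data layout used in the paper, so the 7-point second-order Laplacian, the flat x-fastest layout, and the function name `laplacian_7pt` are all assumptions. In an OpenCL version, the body of the innermost loop would typically be what each work-item computes for one grid point.

```python
def laplacian_7pt(u, nx, ny, nz, h=1.0):
    """Apply a 7-point, second-order Laplacian to the interior of a 3D grid.

    Assumptions (not taken from the paper): `u` is a flat list of length
    nx*ny*nz with x as the fastest-varying index, and boundary points of
    the output are simply left at zero.
    """
    out = [0.0] * (nx * ny * nz)
    inv_h2 = 1.0 / (h * h)
    stride_y = nx        # distance between neighbours in y
    stride_z = nx * ny   # distance between neighbours in z
    for k in range(1, nz - 1):
        for j in range(1, ny - 1):
            for i in range(1, nx - 1):
                c = (k * ny + j) * nx + i  # flat index of the centre point
                out[c] = inv_h2 * (
                    u[c - 1] + u[c + 1]                    # x neighbours
                    + u[c - stride_y] + u[c + stride_y]    # y neighbours
                    + u[c - stride_z] + u[c + stride_z]    # z neighbours
                    - 6.0 * u[c]
                )
    return out


if __name__ == "__main__":
    # u = x^2 + y^2 + z^2 has an exact Laplacian of 6 everywhere.
    n = 5
    u = [float(i * i + j * j + k * k)
         for k in range(n) for j in range(n) for i in range(n)]
    result = laplacian_7pt(u, n, n, n)
    print(result[(2 * n + 2) * n + 2])
```

Each output point reads seven input values and writes one, so the kernel moves a lot of data per floating-point operation, which is one reason memory and bus transfer rates matter for this class of problem.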
Pages: 405-409
Page count: 5