An approach of performance comparisons with OpenMP and CUDA parallel programming on multicore systems

Cited by: 3
Authors
Chang, Chih-Hung [1]
Lu, Chih-Wei [1]
Yang, Chao-Tung [2]
Chang, Tzu-Chieh [2]
Affiliations
[1] Hsiuping Univ Sci & Technol, Dept Informat Management, Taichung, Taiwan
[2] Tunghai Univ, Dept Comp Sci, Taichung, Taiwan
Source
Keywords
auto-parallel; parallel programming; multicore; OpenMP; CUDA;
DOI
10.1002/cpe.3829
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification codes
081202; 0835;
Abstract
In the past, persistent semiconductor problems of operating temperature and power consumption limited performance growth for single-core microprocessors. Microprocessor vendors therefore adopted multicore chip organizations with parallel processing, because this technology promises higher speed at lower power. Within a short time, the trend spread first through CPU development and then to other devices such as GPUs. Modern GPUs are very efficient at manipulating computer graphics, and their highly parallel structure makes them more effective than general-purpose CPUs for a range of complex graphics algorithms. However, multicore processor technology has also brought upheaval and unavoidable difficulties for programmers. Multicore processors offer high performance, but parallel processing brings a challenge as well as an opportunity. Efficiency, and the way the programmer or compiler explicitly parallelizes the software, are the keys to improving performance on multicore chips. In this paper, we propose a parallel programming approach using hybrid CUDA, OpenMP, and MPI programming. Two verification experiments are presented. In the first, we verify the availability and correctness of auto-parallel tools and discuss performance issues on CPU, GPU, and embedded systems. In the second, we verify how hybrid programming improves performance. Copyright (C) 2016 John Wiley & Sons, Ltd.
Pages: 4230-4245
Page count: 16