An approach of performance comparisons with OpenMP and CUDA parallel programming on multicore systems

Cited by: 3
Authors
Chang, Chih-Hung [1]
Lu, Chih-Wei [1]
Yang, Chao-Tung [2]
Chang, Tzu-Chieh [2]
Affiliations
[1] Hsiuping Univ Sci & Technol, Dept Informat Management, Taichung, Taiwan
[2] Tunghai Univ, Dept Comp Sci, Taichung, Taiwan
Source
Keywords
auto-parallel; parallel programming; multicore; OpenMP; CUDA;
DOI
10.1002/cpe.3829
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202; 0835;
Abstract
In the past, persistent semiconductor problems of operating temperature and power consumption limited the performance growth of single-core microprocessors. Microprocessor vendors therefore adopted multicore chip organizations with parallel processing, because the new technology promises higher speed at lower power. Within a short time, this trend spread first through CPU development and then to other components such as GPUs. Modern GPUs are very efficient at manipulating computer graphics, and their highly parallel structure makes them even more effective than general-purpose CPUs for a range of complex graphical algorithms. However, multicore processor technology brought both a revolution and an unavoidable disruption for programmers. Multicore processors offer high performance, but parallel processing brings a challenge as well as an opportunity. Efficiency, and how the programmer or compiler explicitly parallelizes the software, are the keys to enhancing performance on multicore chips. In this paper, we propose a parallel programming approach using hybrid CUDA, OpenMP, and MPI programming. Two verification experiments are presented in the paper. In the first, we verify the availability and correctness of the auto-parallel tools and discuss performance issues on CPU, GPU, and embedded systems. In the second, we verify how hybrid programming can improve performance. Copyright (C) 2016 John Wiley & Sons, Ltd.
Pages: 4230-4245
Page count: 16
Related papers
50 in total
  • [31] OpenMP task scheduling strategies for multicore NUMA systems
    Olivier, Stephen L.
    Porterfield, Allan K.
    Wheeler, Kyle B.
    Spiegel, Michael
    Prins, Jan F.
    INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 2012, 26 (02): : 110 - 124
  • [32] 3D-DRAM Performance for Different OpenMP Scheduling Techniques in Multicore Systems
    Adavally, Shashank
    Kavi, Krishna
    IEEE 20TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS / IEEE 16TH INTERNATIONAL CONFERENCE ON SMART CITY / IEEE 4TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND SYSTEMS (HPCC/SMARTCITY/DSS), 2018, : 675 - 683
  • [33] Performance Evaluation of OpenMP Applications on Virtualized Multicore Machines
    Tao, Jie
    Fuerlinger, Karl
    Marten, Holger
    OPENMP IN THE PETASCALE ERA, (IWOMP 2011), 2011, 6665 : 138 - 150
  • [34] Performance Evaluation of MPI, UPC and OpenMP on Multicore Architectures
    Mallon, Damian A.
    Taboada, Guillermo L.
    Teijeiro, Carlos
    Tourino, Juan
    Fraguela, Basilio B.
    Gomez, Andres
    Doallo, Ramon
    Carlos Mourino, J.
    RECENT ADVANCES IN PARALLEL VIRTUAL MACHINE AND MESSAGE PASSING INTERFACE, PROCEEDINGS, 2009, 5759 : 174 - +
  • [35] OpenMP Thread Affinity for Matrix Factorization on Multicore Systems
    Bylina, Beata
    Bylina, Jaroslaw
    PROCEEDINGS OF THE 2017 FEDERATED CONFERENCE ON COMPUTER SCIENCE AND INFORMATION SYSTEMS (FEDCSIS), 2017, : 489 - 492
  • [36] Implementing OpenMP on a high performance embedded multicore MPSoC
    Chapman, Barbara
    Huang, Lei
    Biscondi, Eric
    Stotzer, Eric
    Shrivastava, Ashish
    Gatherer, Alan
    2009 IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL & DISTRIBUTED PROCESSING, VOLS 1-5, 2009, : 2107 - +
  • [37] A multithreaded CUDA and OpenMP based power-aware programming framework for multi-node GPU systems
    Czarnul, Pawel
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2023,
  • [38] A multithreaded CUDA and OpenMP based power-aware programming framework for multi-node GPU systems
    Czarnul, Pawel
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2023, 35 (25):
  • [39] Parallel programming for OSEM reconstruction with MPI, OpenMP, and hybrid MPI-OpenMP
    Jones, MD
    Yao, RT
    2004 IEEE NUCLEAR SCIENCE SYMPOSIUM CONFERENCE RECORD, VOLS 1-7, 2004, : 3036 - 3042
  • [40] Parallel Computation of Aerial Target Reflection of Background Infrared Radiation: Performance Comparison of OpenMP, OpenACC, and CUDA Implementations
    Guo, Xing
    Wu, Jiaji
    Wu, Zhensen
    Huang, Bormin
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2016, 9 (04) : 1653 - 1662