An approach of performance comparisons with OpenMP and CUDA parallel programming on multicore systems

Cited by: 3
Authors
Chang, Chih-Hung [1]
Lu, Chih-Wei [1]
Yang, Chao-Tung [2]
Chang, Tzu-Chieh [2]
Affiliations
[1] Hsiuping Univ Sci & Technol, Dept Informat Management, Taichung, Taiwan
[2] Tunghai Univ, Dept Comp Sci, Taichung, Taiwan
Source
Keywords
auto-parallel; parallel programming; multicore; OpenMP; CUDA;
DOI
10.1002/cpe.3829
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202; 0835;
Abstract
In the past, persistent semiconductor problems of operating temperature and power consumption limited the performance growth of single-core microprocessors. Microprocessor vendors therefore adopted multicore chip organizations with parallel processing, because this technology promises higher speed at lower power. The trend spread quickly, first through CPU development and then to other devices such as GPUs. Modern GPUs are very efficient at manipulating computer graphics, and their highly parallel structure makes them more effective than general-purpose CPUs for a range of complex graphical algorithms. However, multicore technology has brought a revolution, and an unavoidable disruption, for programmers. Multicore processors offer high performance, but parallel processing brings challenges as well as opportunities. Efficiency, and the way the programmer or compiler explicitly parallelizes the software, are the keys to improving performance on multicore chips. In this paper, we propose a parallel programming approach using hybrid CUDA, OpenMP, and MPI programming. Two verification experiments are presented. In the first, we verify the availability and correctness of auto-parallel tools and discuss performance issues on CPU, GPU, and embedded systems. In the second, we verify that hybrid programming improves performance. Copyright (C) 2016 John Wiley & Sons, Ltd.
Pages: 4230-4245
Page count: 16
Related papers
50 in total
  • [21] CUDA: Scalable parallel programming for high-performance scientific computing
    Luebke, David
    2008 IEEE INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING: FROM NANO TO MACRO, VOLS 1-4, 2008, : 836 - 838
  • [22] PLASMA: Parallel Linear Algebra Software for Multicore Using OpenMP
    Dongarra, Jack
    Gates, Mark
    Haidar, Azzam
    Kurzak, Jakub
    Luszczek, Piotr
    Wu, Panruo
    Yamazaki, Ichitaro
    Yarkhan, Asim
    Abalenkovs, Maksims
    Bagherpour, Negin
    Hammarling, Sven
    Sistek, Jakub
    Stevens, David
    Zounon, Mawussi
    Relton, Samuel D.
    ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 2019, 45 (02):
  • [23] Optimisation Techniques for Multicore Architectures and Parallel Processing using OpenMP
    Ataullah, Sara Tabassum
    Siddique, Mohammed
    2021 INTERNATIONAL CONFERENCE ON DECISION AID SCIENCES AND APPLICATION (DASA), 2021,
  • [24] Parallel Divide-and-Evolve: Experiments with OpenMP on a Multicore Machine
    Candan, Caner
    Dreo, Johann
    Saveant, Pierre
    Vidal, Vincent
    GECCO-2011: PROCEEDINGS OF THE 13TH ANNUAL GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE, 2011, : 1571 - 1578
  • [25] Concurrent Parallel Processing on Graphics and Multicore Processors with OpenACC and OpenMP
    Stone, Christopher P.
    Davis, Roger L.
    Lee, Daryl Y.
    ACCELERATOR PROGRAMMING USING DIRECTIVES, WACCPD 2017, 2018, 10732 : 103 - 122
  • [26] Parallel Implementation of Doolittle Algorithm Using OpenMP for Multicore Machines
    Mustafa, B.
    Shahana, Rafiya
    Ahmed, Waseem
    2015 IEEE INTERNATIONAL ADVANCE COMPUTING CONFERENCE (IACC), 2015, : 575 - 578
  • [27] Towards High-Level Parallel Programming Models for Multicore Systems
    Marowka, Ami
    PROCEEDINGS OF THE 2008 ADVANCED SOFTWARE ENGINEERING & ITS APPLICATIONS, 2008, : 226 - 229
  • [28] Designing a parallel algorithm for Heat Conduction using MPI, OpenMP and CUDA
    Sivanandan, Vinaya
    Kumar, Vikas
    Meher, Srisai
    2015 NATIONAL CONFERENCE ON PARALLEL COMPUTING TECHNOLOGIES (PARCOMPTECH 2015), 2015,
  • [29] SIMPLE PERFORMANCE BOUNDS FOR MULTICORE AND PARALLEL CHANNEL SYSTEMS
    Gamboa, Carlos Fernando
    Robertazzi, Thomas
    PARALLEL PROCESSING LETTERS, 2011, 21 (04) : 439 - 460
  • [30] Designing a parallel algorithm for Heat Conduction using MPI, OpenMP and CUDA
    Sivanandan, Vinaya
    Kumar, Vikas
    Meher, Srisai
    2015 IEEE INTERNATIONAL CONFERENCE ON MICROELECTRONICS SYSTEMS EDUCATION (MSE), 2015,