An Optimization of FMM under CPU plus GPU Heterogeneous Architecture

Authors
Zhu, Yonghua [1]
Lu, Xiao [2]
Affiliations
[1] Shanghai Univ, Ctr Comp, Room 321, Bldg D, 99 Shangda Rd, Shanghai 200444, Peoples R China
[2] Shanghai Univ, Sch Engn & Comp Sci, Shanghai 200072, Peoples R China
Keywords
GPU; Heterogeneous Architecture; FMM; Threads Mapping Model
DOI
10.1109/CEC.2012.33
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The heterogeneous CPU+GPU architecture has been the main trend in high-performance computing and parallel processing in recent years. However, formulating scientific algorithms to take advantage of the performance this architecture offers requires rethinking core methods. This work accelerates the main part of the fast multipole method (FMM) on the heterogeneous architecture. Based on PetFMM, a Two Dimensional Threads Mapping Model (TDTMM) is proposed to lighten the workload per thread on the GPU. The proposed threads mapping model improves the execution efficiency of hardware acceleration. Experimental results show that the proposed approach is feasible and effective.
Pages: 147-150
Number of pages: 4