Mapping Streaming Applications on Commodity Multi-CPU and GPU On-Chip Processors

Cited by: 15
Authors
Vilches, Antonio [1]
Navarro, Angeles [1]
Asenjo, Rafael [1]
Corbera, Francisco [1]
Gran, Ruben [2]
Garzaran, Maria J. [3]
Affiliations
[1] Univ Malaga, E-29071 Malaga, Spain
[2] Univ Zaragoza, E-50009 Zaragoza, Spain
[3] UIUC, Dept Comp Sci, Urbana, IL USA
Keywords
Heterogeneous CPU-GPU chips; pipeline pattern; adaptive mapping; analytical model; energy aware
DOI
10.1109/TPDS.2015.2432809
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
In this paper, we consider the problem of efficiently executing streaming applications on commodity processors composed of several cores and an on-chip GPU. Streaming applications, such as those in vision and video analytics, consist of a pipeline of stages and are good candidates to take advantage of this type of platform. We also consider that characteristics of the input may change while the application is running. Therefore, we propose a framework that adaptively finds the optimal mapping of the pipeline stages. The core of the framework is an analytical model that, coupled with information collected at runtime, is used to dynamically map each pipeline stage to the most efficient device, taking into consideration both performance and energy. Our experimental results show that, for the evaluated applications running on two different architectures, our model always predicts the best configuration among the evaluated alternatives and significantly reduces the amount of information that needs to be collected at runtime. This best configuration has, on average, 20 percent higher throughput than the configuration recommended by a baseline state-of-the-art approach, while the throughput/energy ratio is 43 percent higher. We have measured improvements in throughput and throughput/energy of up to 81 and 204 percent, respectively, when the model is used to adapt to a video that changes from low to high definition.
Pages: 1099-1115
Number of pages: 17