Mapping Streaming Applications on Commodity Multi-CPU and GPU On-Chip Processors

Cited by: 15
Authors
Vilches, Antonio [1]
Navarro, Angeles [1]
Asenjo, Rafael [1]
Corbera, Francisco [1]
Gran, Ruben [2]
Garzaran, Maria J. [3]
Affiliations
[1] Univ Malaga, E-29071 Malaga, Spain
[2] Univ Zaragoza, E-50009 Zaragoza, Spain
[3] UIUC, Dept Comp Sci, Urbana, IL USA
Keywords
Heterogeneous CPU-GPU chips; pipeline pattern; adaptive mapping; analytical model; energy aware
DOI
10.1109/TPDS.2015.2432809
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
In this paper, we consider the problem of efficiently executing streaming applications on commodity processors composed of several cores and an on-chip GPU. Streaming applications, such as those in vision and video analytics, consist of a pipeline of stages and are good candidates to take advantage of this type of platform. We also consider that characteristics of the input may change while the application is running. Therefore, we propose a framework that adaptively finds the optimal mapping of the pipeline stages. The core of the framework is an analytical model coupled with information collected at runtime, used to dynamically map each pipeline stage to the most efficient device, taking into consideration both performance and energy. Our experimental results show that for the evaluated applications running on two different architectures, our model always predicts the best configuration among the evaluated alternatives, and significantly reduces the amount of information that needs to be collected at runtime. This best configuration has, on average, 20 percent higher throughput than the configuration recommended by a baseline state-of-the-art approach, while the throughput/energy ratio is 43 percent higher. We have measured improvements in throughput and throughput/energy of up to 81 and 204 percent, respectively, when the model is used to adapt to a video that changes from low to high definition.
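The idea of choosing a stage-to-device mapping by throughput or by throughput/energy can be illustrated with a minimal sketch. This is not the paper's analytical model: the stage names, the per-item timings and energies, and the assumption that each device runs its assigned stages serially are all hypothetical, chosen only to show how the two metrics can favor different mappings.

```python
from itertools import product

# Hypothetical per-item cost of each stage on each device:
# (service time in milliseconds, energy in millijoules).
# A real system would collect such numbers at runtime.
STAGES = {
    "decode": {"cpu": (4, 80), "gpu": (7, 50)},
    "filter": {"cpu": (10, 200), "gpu": (3, 60)},
    "encode": {"cpu": (5, 100), "gpu": (9, 70)},
}
DEVICES = ("cpu", "gpu")


def evaluate(mapping):
    """Return (items per second, millijoules per item) for one mapping.

    Simplifying assumption: a device handles its assigned stages one after
    another, so the busiest device bounds steady-state throughput.
    """
    busy_ms = {device: 0 for device in DEVICES}
    energy_mj = 0
    for stage, device in mapping.items():
        time_ms, stage_energy = STAGES[stage][device]
        busy_ms[device] += time_ms
        energy_mj += stage_energy
    throughput = 1000.0 / max(busy_ms.values())
    return throughput, energy_mj


def best_mapping(metric="throughput"):
    """Exhaustively search all mappings and return the best one.

    metric is "throughput" or "throughput_per_energy".
    """
    names = list(STAGES)
    best_score, best = None, None
    for devices in product(DEVICES, repeat=len(names)):
        mapping = dict(zip(names, devices))
        throughput, energy_mj = evaluate(mapping)
        if metric == "throughput":
            score = throughput
        else:
            score = throughput / energy_mj
        if best_score is None or score > best_score:
            best_score, best = score, mapping
    return best


if __name__ == "__main__":
    print(best_mapping("throughput"))
    print(best_mapping("throughput_per_energy"))
```

With these made-up numbers, the two metrics pick different mappings, which is the kind of trade-off the abstract refers to when it says the mapping considers both performance and energy. Exhaustive search is only practical for short pipelines; the abstract states that the paper's model reduces how much runtime information must be collected, which this sketch does not attempt.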
Pages: 1099-1115
Page count: 17
Related papers
50 in total
  • [41] Using Criticality of GPU Accesses in Memory Management for CPU-GPU Heterogeneous Multi-Core Processors
    Rai, Siddharth
    Chaudhuri, Mainak
    ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 2017, 16
  • [42] Dynamic and thermodynamic crossover scenarios in the Kob-Andersen mixture: Insights from multi-CPU and multi-GPU simulations
    Coslovich, Daniele
    Ozawa, Misaki
    Kob, Walter
    EUROPEAN PHYSICAL JOURNAL E, 2018, 41 (05):
  • [43] Solvated and generalised Born calculations differences using GPU CUDA and multi-CPU simulations of an antifreeze protein with AMBER
    Peramo, Antonio
    MOLECULAR SIMULATION, 2016, 42 (15) : 1263 - 1273
  • [45] Exploring data flow design and vectorization with oneAPI for streaming applications on CPU plus GPU
    Campos, Cristian
    Asenjo, Rafael
    Navarro, Angeles
    JOURNAL OF SUPERCOMPUTING, 2025, 81 (02):
  • [46] Scenario-Based Design Flow for Mapping Streaming Applications onto On-Chip Many-Core Systems
    Schor, Lars
    Bacivarov, Iuliana
    Rai, Devendra
    Yang, Hoeseok
    Kang, Shin-Haeng
    Thiele, Lothar
    CASES'12: PROCEEDINGS OF THE 2012 ACM INTERNATIONAL CONFERENCE ON COMPILERS, ARCHITECTURES AND SYNTHESIS FOR EMBEDDED SYSTEMS, 2012, : 71 - 80
  • [47] Performance and Power Consumption Investigation for Execution of Integer Operations on CPU and GPU Processors for Multimedia Applications
    Iovanovici, A.
    Visan, C.
    Marcu, M.
    2009 7TH INTERNATIONAL SYMPOSIUM ON INTELLIGENT SYSTEMS AND INFORMATICS, 2009, : 258 - 262
  • [48] UMA-MF: A Unified Multi-CPU/GPU Asynchronous Computing Framework for SGD-Based Matrix Factorization
    Huang, Yizhi
    Liu, Yan
    Bai, Yang
    Chen, Si
    Li, Renfa
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2023, 34 (11) : 2978 - 2993
  • [49] Exploiting On-Chip Routers to Store Dirty Cache Blocks in Tiled Chip Multi-Processors
    Das, Abhijit
    Kumar, Abhishek
    Jose, John
    Palesi, Maurizio
    2020 IEEE COMPUTER SOCIETY ANNUAL SYMPOSIUM ON VLSI (ISVLSI 2020), 2020, : 147 - 152
  • [50] Static cache partitioning robustness analysis for embedded on-chip multi-processors
    Molnos, Anca M.
    Cotofana, Sorin D.
    Heijligers, Marc J. M.
    van Eijndhoven, Jos T. J.
    TRANSACTIONS ON HIGH-PERFORMANCE EMBEDDED ARCHITECTURES AND COMPILERS I, 2007, 4050 : 279 - +