An OpenCL Framework for Heterogeneous Multicores with Local Memory

Authors
Lee, Jaejin [1]
Kim, Jungwon [1]
Seo, Sangmin [1]
Kim, Seungkyun [1]
Park, Jungho [1]
Kim, Honggyu [1]
Thanh Tuan Dao [1]
Cho, Yongjin [1]
Seo, Sung Jong
Lee, Seung Hak
Cho, Seung Mo
Song, Hyo Jung
Suh, Sang-Bum
Choi, Jong-Deok
Affiliations
[1] Seoul Natl Univ, Sch Comp Sci & Engn, Seoul 151744, South Korea
Keywords
OpenCL; Compilers; Runtime; Software-managed caches; Memory consistency; Work-item coalescing; Preload-poststore buffering
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we present the design and implementation of an Open Computing Language (OpenCL) framework that targets heterogeneous accelerator multicore architectures with local memory. The architecture consists of a general-purpose processor core and multiple accelerator cores that typically have no cache. Instead, each accelerator core has a small internal local memory. To overcome the limited size of this local memory, our OpenCL runtime is built on software-managed caches and coherence protocols that guarantee OpenCL memory consistency. To boost performance, the runtime relies on three source-code transformation techniques performed by our OpenCL C source-to-source translator: work-item coalescing, web-based variable expansion, and preload-poststore buffering. Work-item coalescing serializes multiple SPMD-like tasks that execute concurrently in the presence of barriers and runs them sequentially on a single accelerator core. It requires the web-based variable expansion technique to allocate local memory for private variables. Preload-poststore buffering is a buffering technique that eliminates the overhead of software cache accesses. Together with work-item coalescing, it has a synergistic effect on performance. We demonstrate the effectiveness of our OpenCL framework by evaluating its performance on a system that consists of two Cell BE processors. The experimental results show that our approach is promising.
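As a rough illustration of the work-item coalescing and variable expansion ideas described in the abstract, the following C sketch shows how a kernel with one barrier might be serialized on a single core. It is not the paper's translator output: the kernel body, the name `coalesced_kernel`, and the fixed `WG_SIZE` are assumptions made for this example, and the private variable is simply expanded into a per-work-item array rather than selected through the web-based analysis the abstract mentions.

```c
/* Hypothetical work-group size, fixed at compile time for this sketch. */
#define WG_SIZE 4

/*
 * Assumed original OpenCL C kernel (illustrative, not from the paper):
 *
 *   __kernel void k(__global float *a, __local float *tmp) {
 *       int lid = get_local_id(0);
 *       float x = a[lid] * 2.0f;          // private, live across the barrier
 *       tmp[lid] = x;
 *       barrier(CLK_LOCAL_MEM_FENCE);
 *       a[lid] = x + tmp[(lid + 1) % WG_SIZE];
 *   }
 *
 * Coalesced form: the code is split at the barrier, and each region is
 * wrapped in a loop over all work-items of the work-group.
 */
void coalesced_kernel(float *a, float *tmp)
{
    /* Variable expansion: x is live across the barrier, so every
     * work-item needs its own copy once the work-items are serialized. */
    float x[WG_SIZE];

    /* Region before the barrier, run for every work-item in turn. */
    for (int lid = 0; lid < WG_SIZE; lid++) {
        x[lid] = a[lid] * 2.0f;
        tmp[lid] = x[lid];
    }

    /* The barrier becomes the boundary between the two loops: all
     * work-items have finished the first region before any starts
     * the second. */
    for (int lid = 0; lid < WG_SIZE; lid++) {
        a[lid] = x[lid] + tmp[(lid + 1) % WG_SIZE];
    }
}
```

Calling `coalesced_kernel` on a four-element array gives the same values the concurrent work-items would produce, because every read of `tmp` in the second loop happens after all writes in the first.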
Pages: 193-204
Number of pages: 12