An OpenCL Framework for Heterogeneous Multicores with Local Memory

Authors
Lee, Jaejin [1]
Kim, Jungwon [1]
Seo, Sangmin [1]
Kim, Seungkyun [1]
Park, Jungho [1]
Kim, Honggyu [1]
Thanh Tuan Dao [1]
Cho, Yongjin [1]
Seo, Sung Jong
Lee, Seung Hak
Cho, Seung Mo
Song, Hyo Jung
Suh, Sang-Bum
Choi, Jong-Deok
Affiliations
[1] Seoul Natl Univ, Sch Comp Sci & Engn, Seoul 151744, South Korea
Source
PACT 2010: PROCEEDINGS OF THE NINETEENTH INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES | 2010
Keywords
OpenCL; Compilers; Runtime; Software-managed caches; Memory consistency; Work-item coalescing; Preload-poststore buffering
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
In this paper, we present the design and implementation of an Open Computing Language (OpenCL) framework that targets heterogeneous accelerator multicore architectures with local memory. The architecture consists of a general-purpose processor core and multiple accelerator cores that typically do not have any cache. Instead, each accelerator core has a small internal local memory. To overcome the limited size of the local memory, our OpenCL runtime is based on software-managed caches and coherence protocols that guarantee OpenCL memory consistency. To boost performance, the runtime relies on three source-code transformation techniques performed by our OpenCL C source-to-source translator: work-item coalescing, web-based variable expansion, and preload-poststore buffering. Work-item coalescing serializes multiple SPMD-like tasks that execute concurrently in the presence of barriers and runs them sequentially on a single accelerator core. It requires web-based variable expansion to allocate local memory for private variables. Preload-poststore buffering is a buffering technique that eliminates the overhead of software cache accesses, and it works synergistically with work-item coalescing to boost performance. We evaluate the framework on a system that consists of two Cell BE processors, and the experimental results show that the approach is promising.
Pages: 193-204
Page count: 12
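The work-item coalescing step described in the abstract can be illustrated with a small sketch in plain C. This is not the authors' translator output. The kernel, the function name `kernel_coalesced`, the fixed work-group size `WG_SIZE`, and the array standing in for OpenCL `__local` memory are all assumptions made for illustration. The sketch shows the general idea only: the kernel body is split at the barrier, each region is wrapped in a loop over work-item IDs, and a private variable that is live across the barrier is expanded into an array with one slot per work-item.

```c
#include <stddef.h>

/* Hypothetical work-group size, fixed here to keep the sketch simple. */
#define WG_SIZE 4

/*
 * Assumed original OpenCL C kernel (shown only as a comment):
 *
 *   __kernel void k(__global const int *in, __global int *out,
 *                   __local int *scratch) {
 *       int lid = get_local_id(0);
 *       int t = in[lid] * 2;            // private, live across the barrier
 *       scratch[lid] = in[lid] + 1;
 *       barrier(CLK_LOCAL_MEM_FENCE);
 *       out[lid] = t + scratch[(lid + 1) % get_local_size(0)];
 *   }
 *
 * Coalesced form: one core runs all work-items of the group in sequence.
 */
void kernel_coalesced(const int *in, int *out)
{
    int scratch[WG_SIZE]; /* stands in for __local memory */
    int t[WG_SIZE];       /* private variable t, expanded per work-item */

    /* Region before the barrier, executed for every work-item. */
    for (size_t lid = 0; lid < WG_SIZE; lid++) {
        t[lid] = in[lid] * 2;
        scratch[lid] = in[lid] + 1;
    }

    /* The barrier is satisfied once the first loop has finished. */

    /* Region after the barrier, executed for every work-item. */
    for (size_t lid = 0; lid < WG_SIZE; lid++) {
        out[lid] = t[lid] + scratch[(lid + 1) % WG_SIZE];
    }
}
```

Calling `kernel_coalesced` with two `WG_SIZE`-element arrays runs the whole work-group sequentially. Without expanding `t` into an array, the second loop would see only the value written by the last work-item of the first loop, which is why the abstract states that coalescing requires variable expansion.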