Landing Stencil Code on Godson-T

Authors
Hui-Min Cui
Lei Wang
Dong-Rui Fan
Xiao-Bing Feng
Affiliations
[1] Chinese Academy of Sciences, Key Laboratory of Computer System and Architecture, Institute of Computing Technology
[2] Graduate University of Chinese Academy of Sciences
Keywords
many-core; stencil; Jacobi; compiler; SPM; fine-grain synchronization
DOI
Not available
Abstract
The advent of multi-core/many-core chip technology offers both an extraordinary opportunity and a profound challenge. In particular, computer architects and system software designers have a unique opportunity to introduce new architecture features together with adequate compiler technology; in combination, the two may have a profound impact. This paper presents a case study, using the 1-D Jacobi computation, of compiler-amenable performance optimization techniques on the many-core architecture Godson-T. Two features of the Godson-T architecture are chosen for this study: 1) chip-level globally addressable memory, in particular the scratchpad memories (SPM) local to the processing cores; 2) fine-grain memory-based synchronization (e.g., full-empty bits). Leveraging state-of-the-art performance optimization methods for 1-D stencil parallelization (e.g., timed tiling and variants), we developed and implemented a number of many-core-based optimizations for Godson-T. Our experimental study shows good performance in both execution-time speedup and scalability, validates the value of globally accessible SPM and the fine-grain synchronization mechanism (full-empty bits) on Godson-T, and provides some useful guidelines for future compiler technology for many-core chip architectures.
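For readers unfamiliar with the kernel named in the abstract, the following is a minimal sketch of a 1-D Jacobi sweep and of one simple tiling variant. It is an illustration under stated assumptions, not the authors' implementation: the 3-point average with fixed end points, the function names `jacobi_1d` and `jacobi_1d_overlapped`, and the choice of overlapped (halo) tiling are all hypothetical, and the per-tile list copy only stands in for staging data into a core-local scratchpad.

```python
def jacobi_1d(a, steps):
    """Reference 1-D Jacobi: 3-point average, end points held fixed (assumed stencil)."""
    cur = list(a)
    for _ in range(steps):
        nxt = cur[:]
        for i in range(1, len(cur) - 1):
            nxt[i] = (cur[i - 1] + cur[i] + cur[i + 1]) / 3.0
        cur = nxt
    return cur


def jacobi_1d_overlapped(a, steps, tile):
    """Same result, computed one tile at a time with a halo of `steps` cells.

    Each tile reads `steps` extra cells on each side, runs all time steps
    locally, and writes back only its own cells. Halo cells become stale by
    one cell per step, so after `steps` steps the tile's own cells are still
    exact. This is overlapped tiling, chosen here for simplicity.
    """
    n = len(a)
    out = list(a)
    for start in range(0, n, tile):
        end = min(start + tile, n)
        lo = max(start - steps, 0)
        hi = min(end + steps, n)
        local = list(a[lo:hi])  # stands in for a copy into local memory
        for _ in range(steps):
            nxt = local[:]
            for i in range(1, len(local) - 1):
                nxt[i] = (local[i - 1] + local[i] + local[i + 1]) / 3.0
            local = nxt
        out[start:end] = local[start - lo:end - lo]
    return out
```

Each tile in the second function is independent of the others, so the tiles could run on separate cores without synchronization, at the cost of redundant work in the halos. Tiling schemes that avoid the redundant work need neighbouring tiles to exchange boundary values, which is where a fine-grain synchronization mechanism would come into play.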
Pages: 886 - 894
Number of pages: 8