Landing Stencil Code on Godson-T

Authors
Hui-Min Cui
Lei Wang
Dong-Rui Fan
Xiao-Bing Feng
Affiliations
[1] Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences
[2] Graduate University of Chinese Academy of Sciences
Keywords
many-core; stencil; Jacobi; compiler; SPM; fine-grain synchronization
DOI
Not available
Abstract
The advent of multi-core/many-core chip technology offers both an extraordinary opportunity and a profound challenge. In particular, computer architects and system software designers have a unique opportunity to introduce new architecture features together with adequate compiler technology; combined, these may have a profound impact. This paper presents a case study (using the 1-D Jacobi computation) of compiler-amenable performance optimization techniques on the many-core architecture Godson-T. The Godson-T architecture has several unique features that were chosen for this study: 1) chip-level globally addressable memory, in particular the scratchpad memories (SPM) local to the processing cores; 2) fine-grain memory-based synchronization (e.g., full-empty bits). Leveraging state-of-the-art performance optimization methods for 1-D stencil parallelization (e.g., timed tiling and variants), we developed and implemented a number of many-core-based optimizations for Godson-T. Our experimental study shows good performance in both execution-time speedup and scalability, validates the value of globally accessible SPM and the fine-grain synchronization mechanism (full-empty bits) on Godson-T, and provides some useful guidelines for future compiler technology for many-core chip architectures.
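For readers unfamiliar with the kernel named in the abstract, the following is a minimal sketch of a 1-D three-point Jacobi sweep, plus one overlapped-tiling variant that advances several time steps on a local copy of each tile. It is not the authors' implementation. The function names, the averaging coefficients, the fixed boundary values, and the parameters `tile` and `tsteps` are all assumptions made for illustration. The local buffer only loosely stands in for a scratchpad memory, and full-empty-bit synchronization is not modeled.

```python
def jacobi_1d(a, steps):
    """Reference version: `steps` three-point Jacobi sweeps, end points held fixed."""
    cur = list(a)
    nxt = list(a)
    for _ in range(steps):
        for i in range(1, len(cur) - 1):
            nxt[i] = (cur[i - 1] + cur[i] + cur[i + 1]) / 3.0
        cur, nxt = nxt, cur  # swap the two buffers after each sweep
    return cur


def jacobi_1d_overlapped(a, steps, tile, tsteps):
    """Overlapped tiling (assumed variant): each tile copies its cells plus a
    halo of width t into a local buffer, runs t sweeps there, and writes back
    only the cells it owns. Halo cells are recomputed redundantly by neighbors."""
    n = len(a)
    cur = list(a)
    done = 0
    while done < steps:
        t = min(tsteps, steps - done)      # time steps fused in this round
        nxt = list(cur)
        for lo in range(1, n - 1, tile):   # tiles cover interior cells only
            hi = min(lo + tile, n - 1)
            glo = max(lo - t, 0)           # halo start, clipped to the array
            ghi = min(hi + t, n)           # halo end (exclusive), clipped
            buf = cur[glo:ghi]             # local copy, a stand-in for SPM
            tmp = list(buf)
            for _ in range(t):
                for j in range(1, len(buf) - 1):
                    tmp[j] = (buf[j - 1] + buf[j] + buf[j + 1]) / 3.0
                buf, tmp = tmp, buf
            # Stale values creep inward one cell per sweep from each halo edge,
            # so after t sweeps the owned range [lo, hi) is still exact.
            nxt[lo:hi] = buf[lo - glo:hi - glo]
        cur = nxt
        done += t
    return cur
```

In this sketch the tiles within one round are independent of each other, which is what makes them candidates for parallel execution; the cost is the redundant halo computation, which grows with `tsteps`.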
Pages: 886-894
Number of pages: 8