Landing Stencil Code on Godson-T

Authors
Hui-Min Cui
Lei Wang
Dong-Rui Fan
Xiao-Bing Feng
Affiliations
[1] Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences
[2] Graduate University of Chinese Academy of Sciences
Keywords
many-core; stencil; Jacobi; compiler; SPM; fine-grain synchronization
DOI
Not available
Abstract
The advent of multi-core/many-core chip technology offers both an extraordinary opportunity and a profound challenge. In particular, computer architects and system software designers have a unique opportunity to introduce new architecture features together with adequate compiler technology; combined, the two may have a profound impact. This paper presents a case study (using the 1-D Jacobi computation) of compiler-amenable performance optimization techniques on the many-core architecture Godson-T. Two distinctive features of the Godson-T architecture are chosen for this study: 1) chip-level globally addressable memory, in particular the scratchpad memories (SPM) local to the processing cores; 2) fine-grain memory-based synchronization (e.g., full-empty bits). Leveraging state-of-the-art performance optimization methods for 1-D stencil parallelization (e.g., timed tiling and variants), we developed and implemented a number of many-core-based optimizations for Godson-T. Our experimental study shows good performance in both execution-time speedup and scalability, validates the value of globally accessible SPM and the fine-grain synchronization mechanism (full-empty bits) on Godson-T, and provides some useful guidelines for future compiler technology for many-core chip architectures.
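For readers unfamiliar with the kernel named in the abstract, the following is a minimal sequential sketch of a 1-D Jacobi sweep. It is illustrative only and is not the authors' Godson-T implementation: the function name `jacobi_1d`, the 3-point averaging coefficients, and the fixed-endpoint boundary handling are all assumptions chosen for the example.

```python
def jacobi_1d(values, steps):
    """Run `steps` sweeps of a 3-point averaging stencil over a 1-D grid.

    Hypothetical helper for illustration: the equal-weight average and the
    fixed endpoints are assumptions, not details taken from the paper.
    """
    current = list(values)
    for _ in range(steps):
        # Separate output buffer: every update reads only the previous sweep.
        nxt = current[:]
        for i in range(1, len(current) - 1):
            nxt[i] = (current[i - 1] + current[i] + current[i + 1]) / 3.0
        current = nxt
    return current


if __name__ == "__main__":
    print(jacobi_1d([0.0, 3.0, 0.0], 1))
```

Because each point in a sweep depends on its two neighbours from the previous sweep, splitting the grid across cores means values at partition boundaries must be exchanged and ordered between sweeps, which is where shared scratchpad memory and fine-grain synchronization of the kind the abstract describes would plausibly come into play.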
Pages: 886-894
Page count: 8
Related papers
50 in total
  • [1] Landing Stencil Code on Godson-T
    崔慧敏
    王蕾
    范东睿
    冯晓兵
    [J]. Journal of Computer Science & Technology, 2010, 25 (04) : 886 - 894
  • [2] Landing stencil code on godson-T
    Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
    Unknown
    [J]. J Comput Sci Technol, 4 (886-894):
  • [3] Landing Stencil Code on Godson-T
    Cui, Hui-Min
    Wang, Lei
    Fan, Dong-Rui
    Feng, Xiao-Bing
    [J]. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2010, 25 (04) : 886 - 894
  • [4] An optimization of broadcast on godson-T many-core system architecture
    Bao, Ergude
    Li, Weisheng
    Fan, Dongrui
    Yang, Yang
    Ma, Xiaoyu
    [J]. Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2010, 47 (03) : 524 - 531
  • [5] Godson-T: An Efficient Many-Core Architecture for Parallel Program Executions
    Dong-Rui Fan
    Nan Yuan
    Jun-Chao Zhang
    Yong-Bin Zhou
    Wei Lin
    Feng-Long Song
    Xiao-Chun Ye
    He Huang
    Lei Yu
    Guo-Ping Long
    Hao Zhang
    Lei Liu
    [J]. Journal of Computer Science and Technology, 2009, 24 : 1061 - 1073
  • [6] Godson-T: An Efficient Many-Core Architecture for Parallel Program Executions
    Fan, Dong-Rui
    Yuan, Nan
    Zhang, Jun-Chao
    Zhou, Yong-Bin
    Lin, Wei
    Song, Feng-Long
    Ye, Xiao-Chun
    Huang, He
    Yu, Lei
    Long, Guo-Ping
    Zhang, Hao
    Liu, Lei
    [J]. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2009, 24 (06) : 1061 - 1073
  • [7] Godson-T: An Efficient Many-Core Architecture for Parallel Program Executions
    范东睿
    袁楠
    张军超
    周永彬
    林伟
    宋风龙
    叶笑春
    黄河
    余磊
    龙国平
    张浩
    刘磊
    [J]. Journal of Computer Science & Technology, 2009, 24 (06) : 1061 - 1073
  • [8] Scalability study of molecular dynamics simulation on Godson-T many-core architecture
    Peng, Liu
    Tan, Guangming
    Kalia, Rajiv K.
    Nakano, Aiichiro
    Vashishta, Priya
    Fan, Dongrui
    Zhang, Hao
    Song, Fenglong
    [J]. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2013, 73 (11) : 1469 - 1482
  • [9] Preliminary Investigation of Accelerating Molecular Dynamics Simulation on Godson-T Many-Core Processor
    Peng, Liu
    Tan, Guangming
    Kalia, Rajiv K.
    Nakano, Aiichiro
    Vashishta, Priya
    Fang, Dongrui
    Sun, Ninghui
    [J]. EURO-PAR 2010 PARALLEL PROCESSING WORKSHOPS, 2011, 6586 : 349 - 356
  • [10] An optimization of broadcast on Godson-T many-core system architecture
    包尔固德
    李伟生
    范东睿
    杨扬
    马啸宇
    [J]. Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2010, 47 (03) : 524 - 531