Landing Stencil Code on Godson-T

Authors
Hui-Min Cui [1, 2]
Lei Wang [1, 2]
Dong-Rui Fan [1]
Xiao-Bing Feng [1]
Affiliations
[1] Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences
[2] Graduate University of Chinese Academy of Sciences
Funding
National Natural Science Foundation of China;
Keywords
many-core; stencil; Jacobi; compiler; SPM; fine-grain synchronization;
DOI
Not available
Chinese Library Classification
TP332 [Arithmetic units and controllers (CPU)];
Discipline classification code
081201 ;
Abstract
The advent of multi-core/many-core chip technology offers both an extraordinary opportunity and a profound challenge. In particular, computer architects and system software designers have a unique opportunity to introduce new architecture features together with adequate compiler technology; combined, the two may have a profound impact. This paper presents a case study (using the 1-D Jacobi computation) of compiler-amenable performance optimization techniques on the many-core architecture Godson-T. Two unique features of the Godson-T architecture are chosen for this study: 1) chip-level globally addressable memory, in particular the scratchpad memories (SPM) local to the processing cores; 2) fine-grain memory-based synchronization (e.g., full-empty bits). Leveraging state-of-the-art performance optimization methods for 1-D stencil parallelization (e.g., timed tiling and variants), we developed and implemented a number of many-core-based optimizations for Godson-T. Our experimental study shows good performance in both execution-time speedup and scalability, validates the value of globally accessed SPM and the fine-grain synchronization mechanism (full-empty bits) on Godson-T, and provides some useful guidelines for future compiler technology for many-core chip architectures.
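For readers unfamiliar with the kernel named in the abstract, the following is a minimal sequential sketch of a 1-D Jacobi sweep. It is an illustration only, not the paper's implementation: the 3-point averaging stencil, the fixed end points, and the function name `jacobi_1d` are assumptions, since the abstract does not give the exact coefficients or boundary handling. The tiling and synchronization optimizations the abstract mentions would restructure this loop nest; they are not shown here.

```python
def jacobi_1d(a, steps):
    """Run `steps` Jacobi sweeps over the sequence `a`.

    Assumptions (not taken from the paper): a 3-point averaging
    stencil and end points that stay fixed at their initial values.
    """
    cur = list(a)   # values at the current time step
    nxt = list(a)   # buffer for the next time step (end points pre-filled)
    for _ in range(steps):
        # Each interior point depends only on the previous time step,
        # so all points of one sweep could be computed in parallel.
        for i in range(1, len(cur) - 1):
            nxt[i] = (cur[i - 1] + cur[i] + cur[i + 1]) / 3.0
        cur, nxt = nxt, cur  # swap buffers instead of copying
    return cur
```

Because every sweep reads only the previous time step, two buffers are enough. The dependence of each point on its two neighbours from the previous step is what makes synchronization between neighbouring cores necessary once the array is partitioned.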
Pages: 886-894
Page count: 9
Source
Journal of Computer Science and Technology, 2010, 25 (04): 886-894