Scalability study of molecular dynamics simulation on Godson-T many-core architecture

Cited by: 3
Authors
Peng, Liu [1]
Tan, Guangming [2]
Kalia, Rajiv K. [1]
Nakano, Aiichiro [1]
Vashishta, Priya [1]
Fan, Dongrui [2]
Zhang, Hao [2]
Song, Fenglong [2]
Affiliations
[1] University of Southern California, Collaboratory for Advanced Computing and Simulations, Los Angeles, CA 90089, USA
[2] Chinese Academy of Sciences, Institute of Computing Technology, Key Laboratory of Computer System and Architecture, Beijing 100190, People's Republic of China
Keywords
Molecular dynamics; Many-core architecture; Scalability
DOI
10.1016/j.jpdc.2012.07.007
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Molecular dynamics (MD) simulation has broad applications, and an increasing amount of computing power is needed to handle the scale of real-world simulations. The advent of the many-core paradigm brings unprecedented computing power, but harvesting it remains a great challenge because of MD's irregular memory-access pattern. To address this challenge, this paper presents a joint application/architecture study to enhance the scalability of MD on Godson-T-like many-core architectures. First, a preprocessing approach leveraging an adaptive divide-and-conquer framework is designed to exploit locality through the memory hierarchy with software-controlled memory. Then three incremental optimization strategies are proposed to enhance on-chip parallelism on the Godson-T many-core processor: a novel data layout to improve data locality, an on-chip locality-aware parallel algorithm to enhance data reuse, and a pipelining algorithm to hide latency to shared memory. Experiments on a Godson-T simulator exhibit a strong-scaling parallel efficiency of 0.99 on 64 cores, which is confirmed by a field-programmable gate array emulator. In addition, the performance per watt of MD on Godson-T is much higher than that of MD on a 16-core Intel Core i7 symmetric multiprocessor (SMP) and 26 times higher than that of MD on an 8-core, 64-thread Sun T2 processor. Detailed analysis shows that optimizations that use architectural features to maximize data locality and to enhance data reuse benefit scalability most. Furthermore, a hierarchical parallelization scheme is designed to map the MD algorithm to a Godson-T many-core cluster, and a simple performance model is derived, which suggests that the optimization scheme is likely to scale well toward exascale. Certain architectural features are found to be essential for these optimizations, which could guide future hardware developments. Published by Elsevier Inc.
Pages: 1469 - 1482
Number of pages: 14
Related papers
50 in total
  • [1] Preliminary Investigation of Accelerating Molecular Dynamics Simulation on Godson-T Many-Core Processor
    Peng, Liu
    Tan, Guangming
    Kalia, Rajiv K.
    Nakano, Aiichiro
    Vashishta, Priya
    Fan, Dongrui
    Sun, Ninghui
    [J]. EURO-PAR 2010 PARALLEL PROCESSING WORKSHOPS, 2011, 6586 : 349 - 356
  • [2] Performance Analysis and Optimization of Molecular Dynamics Simulation on Godson-T Many-core Processor
    Peng, Liu
    Nakano, Aiichiro
    Tan, Guangming
    Vashishta, Priya
    Fan, Dongrui
    Zhang, Hao
    Kalia, Rajiv K.
    Song, Fenglong
    [J]. PROCEEDINGS OF THE 2011 8TH ACM INTERNATIONAL CONFERENCE ON COMPUTING FRONTIERS (CF 2011), 2011,
  • [3] An optimization of broadcast on godson-T many-core system architecture
    Bao, Ergude
    Li, Weisheng
    Fan, Dongrui
    Yang, Yang
    Ma, Xiaoyu
    [J]. Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2010, 47 (03) : 524 - 531
  • [4] Godson-T: An Efficient Many-Core Architecture for Parallel Program Executions
    Dong-Rui Fan
    Nan Yuan
    Jun-Chao Zhang
    Yong-Bin Zhou
    Wei Lin
    Feng-Long Song
    Xiao-Chun Ye
    He Huang
    Lei Yu
    Guo-Ping Long
    Hao Zhang
    Lei Liu
    [J]. Journal of Computer Science and Technology, 2009, 24 : 1061 - 1073
  • [5] Godson-T: An Efficient Many-Core Architecture for Parallel Program Executions
    Fan, Dong-Rui
    Yuan, Nan
    Zhang, Jun-Chao
    Zhou, Yong-Bin
    Lin, Wei
    Song, Feng-Long
    Ye, Xiao-Chun
    Huang, He
    Yu, Lei
    Long, Guo-Ping
    Zhang, Hao
    Liu, Lei
    [J]. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2009, 24 (06) : 1061 - 1073
  • [6] Godson-T: An Efficient Many-Core Architecture for Parallel Program Executions
    范东睿
    袁楠
    张军超
    周永彬
    林伟
    宋风龙
    叶笑春
    黄河
    余磊
    龙国平
    张浩
    刘磊
    [J]. Journal of Computer Science & Technology, 2009, 24 (06) : 1061 - 1073
  • [7] GODSON-T: AN EFFICIENT MANY-CORE PROCESSOR EXPLORING THREAD-LEVEL PARALLELISM
    Fan, Dongrui
    Zhang, Hao
    Wang, Da
    Ye, Xiaochun
    Song, Fenglong
    Li, Guojie
    Sun, Ninghui
    [J]. IEEE MICRO, 2012, 32 (02) : 38 - 47
  • [8] Large-Scale Molecular Dynamics Simulation Based on Heterogeneous Many-Core Architecture
    Zhou, Xu
    Wei, Zhiqiang
    Lu, Hao
    He, Jiaqi
    Gao, Yuan
    Hu, Xiaotong
    Wang, Cunji
    Dong, Yujie
    Liu, Hao
    [J]. JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2024, 64 (03) : 851 - 861
  • [9] Modeling and Simulation of a Many-Core Architecture Using SystemC
    Silva, Ana Rita
    Jose, Wilson
    Neto, Horacio
    Vestias, Mario
    [J]. CONFERENCE ON ELECTRONICS, TELECOMMUNICATIONS AND COMPUTERS - CETC 2013, 2014, 17 : 146 - 153
  • [10] Parallel simulation of many-core processor and many-core clusters
    Lü, Huiwei
    Cheng, Yuan
    Bai, Lu
    Chen, Mingyu
    Fan, Dongrui
    Sun, Ninghui
    [J]. Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2013, 50 (05) : 1110 - 1117