Automatic Code Generation and Optimization of Large-scale Stencil Computation on Many-core Processors

Cited by: 9
Authors
Li, Mingzhen [1, 2]
Liu, Yi [2]
Yang, Hailong [1, 2]
Hu, Yongmin [2]
Sun, Qingxiao [2]
Chen, Bangduo [2]
You, Xin [2]
Liu, Xiaoyan [2]
Luan, Zhongzhi [2]
Qian, Depei [2]
Affiliations
[1] State Key Lab Software Dev Environm, Beijing, Peoples R China
[2] Beihang Univ, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Stencil; Domain Specific Language; Performance Optimization; Manycore Architecture;
DOI
10.1145/3472456.3473517
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Stencil computation is an indispensable building block of many scientific applications and is widely used by the numerical solvers of partial differential equations (PDEs). Due to the complex computation patterns of different stencils and the various hardware targets (e.g., many-core processors), many domain-specific languages (DSLs) have been proposed to optimize stencil computation. However, existing stencil DSLs mostly focus on performance optimizations for homogeneous many-core processors such as CPUs and GPUs, and fail to embrace emerging heterogeneous many-core processors such as Sunway. In addition, few of them support expressing stencils with multiple time dependencies or applying optimizations in both the spatial and temporal dimensions. Moreover, most stencil DSLs are unable to generate code that runs efficiently at large scale, which limits their practical applicability. In this paper, we propose MSC, a new stencil DSL designed to express stencil computation in both spatial and temporal dimensions. It can generate high-performance stencil code for large-scale execution on emerging many-core processors. Specifically, we design several optimization primitives for improving parallelism and data locality, and a communication library for efficient halo exchange in large-scale execution. The experimental results show that MSC achieves better performance than the state-of-the-art stencil DSLs.
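As an illustration of what "a stencil with multiple time dependencies" means, the following is a minimal sketch in plain Python. It is not MSC syntax and not the authors' generated code. The update rule (a 1D 3-point, wave-equation-style stencil), the function names `step` and `run`, and the coefficient `c` are all assumptions chosen for this example. Each new value depends on neighbouring cells at the current time step and on the same cell at the previous time step, and the outermost cells play the role of a fixed halo.

```python
def step(prev, curr, c=0.25):
    """Apply one time step of a hypothetical 1D 3-point stencil.

    The new value at cell i depends on two earlier time steps:
    curr (time t) and prev (time t-1).
    """
    n = len(curr)
    nxt = list(curr)  # boundary (halo) cells keep their current values
    for i in range(1, n - 1):
        # Spatial dependency: left and right neighbours at time t.
        laplacian = curr[i - 1] - 2.0 * curr[i] + curr[i + 1]
        # Temporal dependency: the same cell at times t and t-1.
        nxt[i] = 2.0 * curr[i] - prev[i] + c * laplacian
    return nxt


def run(initial, steps, c=0.25):
    """Advance the grid by `steps` time steps, starting at rest."""
    prev = list(initial)
    curr = list(initial)
    for _ in range(steps):
        prev, curr = curr, step(prev, curr, c)
    return curr


if __name__ == "__main__":
    print(run([0.0, 0.0, 1.0, 0.0, 0.0], 1))
```

In a distributed run, each process would own a sub-grid and refresh its halo cells from neighbouring processes before every call to `step`; that exchange is the part the abstract attributes to MSC's communication library, and it is left out of this sketch.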
Pages: 12
Related papers
50 in total
  • [1] A PGAS Execution Model for Efficient Stencil Computation on Many-Core Processors
    Ikei, Mitsuru
    Sato, Mitsuhisa
    [J]. 2014 14TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING (CCGRID), 2014, : 305 - 314
  • [2] Multi-level spatial and temporal tiling for efficient HPC stencil computation on many-core processors with large shared caches
    Yount, Charles
    Duran, Alejandro
    Tobin, Josh
    [J]. FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2019, 92 : 903 - 919
  • [3] Large-Scale Automatic K-Means Clustering for Heterogeneous Many-Core Supercomputer
    Yu, Teng
    Zhao, Wenlai
    Liu, Pan
    Janjic, Vladimir
    Yan, Xiaohan
    Wang, Shicai
    Fu, Haohuan
    Yang, Guangwen
    Thomson, John
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2020, 31 (05) : 997 - 1008
  • [4] Scaling and optimizing the Gysela code on a cluster of many-core processors
    Latu, Guillaume
    Asahi, Yuuichi
    Bigot, Julien
    Feher, Tamas
    Grandgirard, Virginie
    [J]. 2018 30TH INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD 2018), 2018, : 466 - 473
  • [5] A Study of Euclidean Distance Matrix Computation on Intel Many-Core Processors
    Rechkalov, Timofey
    Zymbler, Mikhail
    [J]. PARALLEL COMPUTATIONAL TECHNOLOGIES, PCT 2018, 2018, 910 : 200 - 215
  • [6] Parallelization and fault-tolerance of evolutionary computation on many-core processors
    Sato, Yuji
    Sato, Mikiko
    [J]. 2013 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2013, : 2602 - 2609
  • [7] Optimization of Scan Algorithms on Multi- and Many-core Processors
    Sun, Qiao
    Yang, Chao
    [J]. 2014 21ST INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING (HIPC), 2014,
  • [8] Optimization of the Load Balancing Policy for Tiled Many-Core Processors
    Liu, Ye
    Kato, Shinpei
    Edahiro, Masato
    [J]. IEEE ACCESS, 2019, 7 : 10176 - 10188
  • [9] Enhancing Performance of Large-scale Electronic Structure Calculations with Many-core Computing
    Ryu, Hoon
    Jeong, Yosang
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER), 2016, : 142 - 143
  • [10] Large-Scale Molecular Dynamics Simulation Based on Heterogeneous Many-Core Architecture
    Zhou, Xu
    Wei, Zhiqiang
    Lu, Hao
    He, Jiaqi
    Gao, Yuan
    Hu, Xiaotong
    Wang, Cunji
    Dong, Yujie
    Liu, Hao
    [J]. JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2024, 64 (03) : 851 - 861