Understanding Stencil Code Performance On MultiCore Architectures

Cited by: 23
Authors
Rahman, Shah M. Faizur [1]
Yi, Qing [1]
Qasem, Apan [2]
Affiliations
[1] Univ Texas San Antonio, Dept Comp Sci, San Antonio, TX 78249 USA
[2] Texas State Univ, Dept Comp Sci, San Marcos, TX USA
Funding
US National Science Foundation
DOI
10.1145/2016604.2016641
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Stencil computations are the foundation of many large applications in scientific computing. Previous research has shown that several optimization mechanisms, including rectangular blocking and time skewing combined with wavefront- and pipeline-based parallelization, can significantly improve the performance of stencil kernels on multi-core architectures. However, the overall performance impact of these optimizations is difficult to predict because of the interplay of load imbalance, synchronization overhead, and cache locality. This paper presents a detailed performance study of these optimizations by applying them in a wide variety of configurations, using hardware counters to monitor the efficiency of architectural components, and then developing a set of formulas via regression analysis to model their overall performance impact in terms of the affected hardware counter values. We have applied our methodology to three stencil computation kernels: a 7-point Jacobi, a 27-point Jacobi, and a 7-point Gauss-Seidel computation. Our experimental results show that a precise formula can be developed for each kernel to accurately model the overall performance impact of varying optimizations and thereby effectively guide the performance analysis and tuning of these kernels.
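To make the terminology concrete, the following is a minimal sketch of one sweep of a 7-point Jacobi stencil, written once as a plain triple loop and once with rectangular blocking. It is not the authors' implementation: the function names, the flat row-major array layout, the uniform 1/7 averaging weights, and the block sizes are all assumptions chosen for illustration. Because a Jacobi sweep reads only the old array and writes only the new one, reordering the iterations into blocks does not change the result.

```python
def idx(i, j, k, n):
    """Flat row-major index into an n x n x n grid (assumed layout)."""
    return (i * n + j) * n + k


def jacobi7_point(src, i, j, k, n):
    """Average of a cell and its six face neighbours (assumed 1/7 weights)."""
    return (src[idx(i, j, k, n)]
            + src[idx(i - 1, j, k, n)] + src[idx(i + 1, j, k, n)]
            + src[idx(i, j - 1, k, n)] + src[idx(i, j + 1, k, n)]
            + src[idx(i, j, k - 1, n)] + src[idx(i, j, k + 1, n)]) / 7.0


def sweep_naive(src, n):
    """One unblocked Jacobi sweep over the interior points."""
    dst = list(src)  # boundary cells are copied unchanged
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            for k in range(1, n - 1):
                dst[idx(i, j, k, n)] = jacobi7_point(src, i, j, k, n)
    return dst


def sweep_blocked(src, n, bi, bj, bk):
    """The same sweep, visiting the interior in bi x bj x bk rectangular blocks."""
    dst = list(src)  # boundary cells are copied unchanged
    for ii in range(1, n - 1, bi):
        for jj in range(1, n - 1, bj):
            for kk in range(1, n - 1, bk):
                # Work on one block; min() clips blocks at the grid edge.
                for i in range(ii, min(ii + bi, n - 1)):
                    for j in range(jj, min(jj + bj, n - 1)):
                        for k in range(kk, min(kk + bk, n - 1)):
                            dst[idx(i, j, k, n)] = jacobi7_point(src, i, j, k, n)
    return dst
```

In a compiled kernel the block sizes would be tuned so that a block's working set fits in cache; this sketch only shows the loop structure and does not by itself demonstrate any speedup.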
Pages: 10