Complexities of Performance Prediction for Bandwidth-Limited Loop Kernels on Multi-Core Architectures

Cited by: 2
Authors
Treibig, Jan [1]
Hager, Georg [1]
Wellein, Gerhard [1]
Affiliations
[1] Univ Erlangen Nurnberg, Reg Rechenzentrum Erlangen, D-91058 Erlangen, Germany
DOI
10.1007/978-3-642-13872-0_1
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
The balance metric is a simple approach to estimate the performance of bandwidth-limited loop kernels. However, applying the method to modern multicore architectures yields unsatisfactory results. This paper analyzes the influence of cache hierarchy design on performance predictions for bandwidth-limited loop kernels on current mainstream processors. We present a diagnostic model with improved predictive power, correcting the limitations of the simple balance metric. The importance of code execution overhead even in bandwidth-bound situations is emphasized.
Pages: 3-12
Number of pages: 10
Related papers
  • [1] Introducing a Performance Model for Bandwidth-Limited Loop Kernels
    Treibig, Jan
    Hager, Georg
    [J]. PARALLEL PROCESSING AND APPLIED MATHEMATICS, PT I, 2010, 6067 : 615 - 624
  • [2] Branch Prediction Migration for Multi-core Architectures
    Zhang, Tan
    Zhou, Chaobing
    Huang, Libo
    Xiao, Nong
    [J]. 2017 INTERNATIONAL CONFERENCE ON NETWORKING, ARCHITECTURE, AND STORAGE (NAS), 2017, : 282 - 283
  • [3] High Performance Global Illumination on Multi-core Architectures
    Padron, Emilio J.
    Amor, Margarita
    Doallo, Ramon
    Boo, Montserrat
    [J]. PROCEEDINGS OF THE PARALLEL, DISTRIBUTED AND NETWORK-BASED PROCESSING, 2009, : 93 - +
  • [4] Performance issues in emerging homogeneous multi-core architectures
    Kayi, Abdullah
    El-Ghazawi, Tarek
    Newby, Gregory B.
    [J]. SIMULATION MODELLING PRACTICE AND THEORY, 2009, 17 (09) : 1485 - 1499
  • [5] Interconnection Network Performance of Multi-core Cluster Architectures
    Hamid, Norhazlina
    Walters, Robert
    Wills, Gary
    [J]. 2015 2ND INTERNATIONAL CONFERENCE ON COMPUTER, COMMUNICATIONS, AND CONTROL TECHNOLOGY (I4CT), 2015,
  • [6] Understanding the Impact of Cache Performance on Multi-core Architectures
    Ramasubramaniam, N.
    Srinivas, V. V.
    Kumar, P. Pavan
    [J]. INFORMATION TECHNOLOGY AND MOBILE COMMUNICATION, 2011, 147 : 403 - 406
  • [7] Improving Branch Prediction for Thread Migration on Multi-core Architectures
    Zhang, Tan
    Zhou, Chaobing
    Huang, Libo
    Xiao, Nong
    Ma, Sheng
    [J]. NETWORK AND PARALLEL COMPUTING (NPC 2017), 2017, 10578 : 87 - 99
  • [8] Heterogeneous multi-core architectures
    Mitra, Tulika
    [J]. IPSJ Transactions on System LSI Design Methodology, 2015, 8 : 51 - 62
  • [9] Performance improvement of bandwidth-limited coherent OCDMA system
    Chen, Xiaogang
    Chen, Deyi
    Wang, Zonglong
    [J]. PHOTONIC NETWORK COMMUNICATIONS, 2008, 16 (02) : 149 - 154