Modeling Cache Contention and Throughput of Multiprogrammed Manycore Processors

Cited by: 13
Authors
Chen, Xi E. [1 ]
Aamodt, Tor M. [2 ]
Affiliations
[1] NVIDIA Corp, Beaverton, OR 97006 USA
[2] Univ British Columbia, Dept Elect & Comp Engn, Vancouver, BC V6T 1Z4, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
Analytical modeling; cache contention; manycore; fine-grained multithreading; throughput; PERFORMANCE;
DOI
10.1109/TC.2011.141
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
This paper proposes an analytical model for accurately predicting the impact of contention on cache miss rates. The focus is multiprogrammed workloads running on multithreaded manycore architectures. This work addresses a key challenge facing earlier cache contention models as the number of concurrent threads exceeds the associativity of shared caches. The memory access characteristics of individual applications are obtained in isolation by profiling their circular sequences and two new measures of access locality are proposed. An evaluation of this model in the context of a Niagara processor shows that it achieves an average 8.7 percent error in miss rate predictions which improves upon the best prior model by 48.1x. This paper also presents a novel Markov chain throughput model. When combining the contention model with the Markov chain model, throughput is estimated with an average error of 8.3 percent compared to detailed simulation. Moreover, the combined model tracks throughput sufficiently well to find the same optimized design point for application-specific workloads 65 times faster than detailed simulation. This paper also shows that the models accurately predict cache contention and throughput trends across various workloads on real hardware.
Pages: 913 - 927
Page count: 15
Related papers
50 in total
  • [1] Reducing Contention in Shared Last-Level Cache for Throughput Processors
    Kuo, Hsien-Kai
    Lai, Bo-Cheng Charles
    Jou, Jing-Yang
    [J]. ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS, 2014, 20 (01) : 1 - 28
  • [2] Characterization and modeling of multicast communication in cache-coherent manycore processors
    Abadal, Sergi
    Martinez, Raul
    Sole-Pareta, Josep
    Alarcon, Eduard
    Cabellos-Aparicio, Albert
    [J]. COMPUTERS & ELECTRICAL ENGINEERING, 2016, 51 : 168 - 183
  • [3] Exhaustive Evaluation of Memory-Latency Sensitivity on Manycore Processors with Large Cache
    Tanabe, Noboru
    Endo, Toshio
    [J]. 2018 2ND INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPILATION, COMPUTING AND COMMUNICATIONS (HP3C 2018), 2018, : 27 - 34
  • [4] Evaluation of the Memory Communication Traffic in a Hierarchical Cache Model for Massively-Manycore Processors
    Al Khanjari, Sharifa
    Vanderbauwhede, Wim
    [J]. 2016 24TH EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED, AND NETWORK-BASED PROCESSING (PDP), 2016, : 726 - 733
  • [5] Priority-Based Cache Allocation in Throughput Processors
    Li, Dong
    Rhu, Minsoo
    Johnson, Daniel R.
    O'Connor, Mike
    Erez, Mattan
    Burger, Doug
    Fussell, Donald S.
    Keckler, Stephen W.
    [J]. 2015 IEEE 21ST INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA), 2015, : 89 - 100
  • [6] Fast and accurate thermal modeling and simulation of manycore processors and workloads
    Wojciechowski, Bartosz
    Berezowski, Krzysztof S.
    Patronik, Piotr
    Biernat, Janusz
    [J]. MICROELECTRONICS JOURNAL, 2013, 44 (11) : 986 - 993
  • [7] Methods for modeling resource contention on simultaneous multithreading processors
    Moseley, T
    Kihm, JL
    Connors, DA
    Grunwald, D
    [J]. 2005 IEEE INTERNATIONAL CONFERENCE ON COMPUTER DESIGN: VLSI IN COMPUTERS & PROCESSORS, PROCEEDINGS, 2005, : 373 - 380
  • [8] Online Cache Modeling for Commodity Multicore Processors
    West, Richard
    Zaroo, Puneet
    Waldspurger, Carl A.
    Zhang, Xiao
    [J]. PACT 2010: PROCEEDINGS OF THE NINETEENTH INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES, 2010, : 563 - 564
  • [9] Online cache modeling for commodity multicore processors
    West, Richard
    Waldspurger, Carl A.
    Zaroo, Puneet
    Zhang, Xiao
    [J]. Operating Systems Review (ACM), 2010, 44 (04): : 19 - 29
  • [10] On Linear Learning with Manycore Processors
    Wszola, Eliza
    Mendler-Dunner, Celestine
    Jaggi, Martin
    Pueschel, Markus
    [J]. 2019 IEEE 26TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING, DATA, AND ANALYTICS (HIPC), 2019, : 184 - 194