Thread Criticality Predictors for Dynamic Performance, Power, and Resource Management in Chip Multiprocessors

Authors
Bhattacharjee, Abhishek [1]
Martonosi, Margaret [1]
Affiliations
[1] Princeton Univ, Dept Elect Engn, Princeton, NJ 08544 USA
Keywords
Thread Criticality Prediction; Parallel Processing; Intel TBB; DVFS; Caches;
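DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract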
With the shift towards chip multiprocessors (CMPs), exploiting and managing parallelism has become a central problem in computer systems. Many issues of parallelism management boil down to discerning which running threads or processes are critical, or slowest, versus which are non-critical. If one can accurately predict critical threads in a parallel program, then one can respond in a variety of ways. Possibilities include running the critical thread at a faster clock rate, performing load balancing techniques to offload work onto currently non-critical threads, or giving the critical thread more on-chip resources to execute faster. This paper proposes and evaluates simple but effective thread criticality predictors for parallel applications. We show that accurate predictors can be built using counters that are typically already available on-chip. Our predictor, based on memory hierarchy statistics, identifies thread criticality with an average accuracy of 93% across a range of architectures. We also demonstrate two applications of our predictor. First, we show how Intel's Threading Building Blocks (TBB) parallel runtime system can benefit from task stealing techniques that use our criticality predictor to reduce load imbalance. Using criticality prediction to guide TBB's task-stealing decisions improves performance by 13-32% for TBB-based PARSEC benchmarks running on a 32-core CMP. As a second application, criticality prediction guides dynamic energy optimizations in barrier-based applications. By running the predicted critical thread at the full clock rate and frequency-scaling non-critical threads, this approach achieves average energy savings of 15% while negligibly degrading performance for SPLASH-2 and PARSEC benchmarks.
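The idea in the abstract can be illustrated with a small sketch. It is not the paper's predictor: the abstract says only that the predictor is built from memory hierarchy statistics, so the choice of counters (L1 and L2 misses), the weights, the frequency range, and every name below are assumptions made for illustration. The sketch scores each thread by weighted cache misses, treats the highest-scoring thread as critical, keeps it at full frequency, and scales the others down.

```python
from dataclasses import dataclass

# Hypothetical weights (not from the paper): an L2 miss is assumed to
# stall a thread roughly ten times longer than an L1 miss.
L1_MISS_WEIGHT = 1
L2_MISS_WEIGHT = 10


@dataclass
class ThreadCounters:
    """Per-thread cache-miss counts, as an on-chip counter might report them."""
    thread_id: int
    l1_misses: int
    l2_misses: int


def criticality_score(c: ThreadCounters) -> int:
    # Higher score = more time presumably lost in the memory hierarchy.
    return L1_MISS_WEIGHT * c.l1_misses + L2_MISS_WEIGHT * c.l2_misses


def predict_critical(counters: list[ThreadCounters]) -> int:
    # The thread with the highest score is predicted to be the slowest.
    return max(counters, key=criticality_score).thread_id


def pick_frequencies(counters, f_max=1.0, f_min=0.5):
    """Critical thread runs at f_max; others scale with their score, floored at f_min."""
    scores = {c.thread_id: criticality_score(c) for c in counters}
    top = max(scores.values())
    if top == 0:  # no misses observed yet: do not scale anything down
        return {tid: f_max for tid in scores}
    return {tid: max(f_min, f_max * s / top) for tid, s in scores.items()}


sample = [
    ThreadCounters(0, l1_misses=100, l2_misses=5),
    ThreadCounters(1, l1_misses=40, l2_misses=20),
    ThreadCounters(2, l1_misses=10, l2_misses=1),
]
critical = predict_critical(sample)
freqs = pick_frequencies(sample)
```

With the sample counters, thread 1 has the highest weighted miss count and is kept at full frequency, while thread 2 is clamped to the assumed floor. A task-stealing runtime could use the same score to pick a victim, for example by stealing work from the thread predicted to be most critical.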
Pages: 290-301
Number of pages: 12