System-level dynamic thermal management for high-performance microprocessors

Cited by: 56
Authors
Kumar, Amit [1]
Shang, Li [2]
Peh, Li-Shiuan [1]
Jha, Niraj K. [1]
Affiliations
[1] Princeton Univ, Dept Elect Engn, Princeton, NJ 08544 USA
[2] Queens Univ, Dept Elect & Comp Engn, Kingston, ON K7L 3N6, Canada
Funding
U.S. National Science Foundation
Keywords
dynamic thermal management; hybrid hardware-software management; thermal model
DOI
10.1109/TCAD.2007.907062
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Thermal issues are fast becoming major design constraints in high-performance systems. Temperature variations adversely affect system reliability and prompt worst-case design. Researchers have recently proposed dynamic thermal-management (DTM) techniques that target average-case design and tackle temperature at runtime. Past work on DTM, however, has studied individual techniques in isolation; it does not take a system-level approach that combines hardware and software support synergistically, and it therefore incurs significant execution-time overhead. In this paper, we propose HybDTM, a system-level framework for fine-grained, coordinated thermal management that combines hardware techniques (such as clock gating) with software techniques (such as thermal-aware process scheduling) to exploit the advantages of both. We show that hardware techniques can be used reactively to manage the overall temperature during thermal emergencies, while software techniques, applied proactively with operating system (OS) support, build on them to balance the overall thermal profile with minimal overhead. To evaluate the proposed hybrid-DTM policy, we develop a novel regression-based thermal model that provides fast and accurate temperature estimates for runtime thermal characterization of all applications running on the system. The model uses the hardware performance counters available in modern high-performance processors and is trained at runtime using thermal sensors. It is validated against actual temperature measurements from online thermal sensors, with an average estimation error of less than 5%. We also study system-level DTM issues, jointly considering the processor and memory, and show how a unified DTM approach can benefit from global knowledge of individual system components.
We evaluate the proposed methodology on a desktop system with an Intel Pentium-4 processor and a modified Linux OS, running a number of SPEC2000 benchmarks in both uniprocessor and simultaneous multithreaded environments. The proposed technique successfully manages the overall temperature with an average execution-time overhead of only 10.4% (20.1% maximum) relative to the case without any DTM, compared with 23.9% (46% maximum) for purely hardware-based DTM. Our system, including the thermal-aware OS, the built-in runtime thermal-characterization model, and the interface to the underlying hardware using the Pentium-4 processor, is ready for release.
Pages: 96-108
Page count: 13
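
The abstract describes a regression-based thermal model that maps hardware performance-counter readings to temperature and is trained against thermal-sensor readings. The abstract does not give the regression form or the counters used, so the following is only a minimal sketch under stated assumptions: a plain linear least-squares fit, two hypothetical counter-derived features (instructions per cycle and cache miss rate), and synthetic sensor values. The function names `fit_linear_model`, `estimate_temperature`, and `mean_relative_error` are illustrative and do not come from HybDTM.

```python
from typing import List, Sequence


def fit_linear_model(samples: Sequence[Sequence[float]],
                     temps: Sequence[float]) -> List[float]:
    """Ordinary least squares via the normal equations.

    samples: one row of counter-derived features per observation (assumed features).
    temps:   the sensor temperature recorded with each row.
    Returns [bias, w1, ..., wk].
    """
    rows = [[1.0, *s] for s in samples]          # prepend a bias term
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, temps))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(a[k][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("counter samples are linearly dependent")
        a[col], a[pivot] = a[pivot], a[col]
        for k in range(n):
            if k != col:
                factor = a[k][col] / a[col][col]
                a[k] = [x - factor * y for x, y in zip(a[k], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


def estimate_temperature(weights: Sequence[float],
                         counters: Sequence[float]) -> float:
    """Apply the fitted model to one set of counter readings."""
    return weights[0] + sum(w * c for w, c in zip(weights[1:], counters))


def mean_relative_error(weights: Sequence[float],
                        samples: Sequence[Sequence[float]],
                        temps: Sequence[float]) -> float:
    """Average of |estimate - measured| / measured over all observations."""
    errors = [abs(estimate_temperature(weights, s) - t) / t
              for s, t in zip(samples, temps)]
    return sum(errors) / len(errors)


# Synthetic training data (not from the paper): (IPC, cache miss rate) -> deg C
training_counters = [(1.0, 0.2), (2.0, 0.1), (0.5, 0.8), (1.5, 0.5)]
training_temps = [51.0, 60.5, 49.0, 57.5]

weights = fit_linear_model(training_counters, training_temps)
error = mean_relative_error(weights, training_counters, training_temps)
```

In a runtime setting, the fit would be repeated as new sensor readings arrive, and `estimate_temperature` would give a per-application estimate between sensor samples; how HybDTM actually schedules retraining is not stated in the abstract.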