Machine Learning-Based Temperature Prediction for Runtime Thermal Management Across System Components

Cited by: 63
Authors
Zhang, Kaicheng [1]
Guliani, Akhil [1]
Ogrenci-Memik, Seda [1]
Memik, Gokhan [1]
Yoshii, Kazutomo [2]
Sankaran, Rajesh [2]
Beckman, Pete [2]
Affiliations
[1] Northwestern Univ, Dept Elect Engn & Comp Sci, 2145 Sheridan Rd, Evanston, IL 60208 USA
[2] Argonne Natl Lab, Math & Comp Sci Div, 9700 South Cass Ave, Argonne, IL 60439 USA
Funding
U.S. National Science Foundation
Keywords
Thermal modeling; many-core processors; operating systems; high performance computing systems; NEURAL-NETWORKS; WORKLOAD
DOI
10.1109/TPDS.2017.2732951
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Elevated temperatures limit the peak performance of systems because of frequent interventions by thermal throttling. Non-uniform thermal states across system nodes also cause performance variation within seemingly equivalent nodes leading to significant degradation of overall performance. In this paper we present a framework for creating a lightweight thermal prediction system suitable for run-time management decisions. We pursue two avenues to explore optimized lightweight thermal predictors. First, we use feature selection algorithms to improve the performance of previously designed machine learning methods. Second, we develop alternative methods using neural network and linear regression-based methods to perform a comprehensive comparative study of prediction methods. We show that our optimized models achieve improved performance with better prediction accuracy and lower overhead as compared with the Gaussian process model proposed previously. Specifically we present a reduced version of the Gaussian process model, a neural network-based model, and a linear regression-based model. Using the optimization methods, we are able to reduce the average prediction errors in the Gaussian process from 4.2 °C to 2.9 °C. We also show that the newly developed models using neural network and Lasso linear regression have average prediction errors of 2.9 °C and 3.8 °C respectively. The prediction overheads are 0.22, 0.097, and 0.026 ms per prediction for reduced Gaussian process, neural network, and Lasso linear regression models, respectively, compared with 0.57 ms per prediction for the previous Gaussian process model. We have implemented our proposed thermal prediction models on a two-node system configuration to help identify the optimal task placement. The task placement identified by the models reduces the average system temperature by up to 11.9 °C without any performance degradation. Furthermore, these models respectively achieve 75, 82.5, and 74.17 percent success rates in correctly pointing to those task placements with better thermal response, compared with 72.5 percent success for the original model in achieving the same objective. Finally, we extended our analysis to a 16-node system and we were able to train models and execute them in real time to guide task migration and achieve on average 17 percent reduction in the overall system cooling power.
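The accuracy and overhead figures quoted in the abstract can be put side by side with a few lines of arithmetic. The sketch below is only a reading aid: the numbers are copied from the abstract, while the dictionary layout and the names `BASELINE`, `MODELS`, and `compare` are illustrative assumptions and not part of the paper.

```python
# Figures copied from the abstract; names and layout are illustrative only.
BASELINE = {"error_c": 4.2, "overhead_ms": 0.57}  # previous Gaussian process model

MODELS = {
    "reduced Gaussian process": {"error_c": 2.9, "overhead_ms": 0.22},
    "neural network": {"error_c": 2.9, "overhead_ms": 0.097},
    "Lasso linear regression": {"error_c": 3.8, "overhead_ms": 0.026},
}


def compare(models, baseline):
    """Return error reduction (degrees C) and overhead speedup versus the baseline."""
    rows = {}
    for name, m in models.items():
        rows[name] = {
            # How many degrees lower the average prediction error is.
            "error_reduction_c": round(baseline["error_c"] - m["error_c"], 2),
            # How many times cheaper one prediction is than with the baseline.
            "speedup": round(baseline["overhead_ms"] / m["overhead_ms"], 1),
        }
    return rows


if __name__ == "__main__":
    for name, row in compare(MODELS, BASELINE).items():
        print(f"{name}: {row['error_reduction_c']} °C lower error, "
              f"{row['speedup']}x lower overhead")
```

Read this way, the trade-off reported in the abstract is easier to see: the Lasso model gives up some accuracy relative to the other two optimized models in exchange for the lowest per-prediction cost.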
Pages: 405-419
Number of pages: 15
Related papers
50 in total
  • [1] Online Machine Learning-based Temperature Prediction for Thermal-aware NoC System
    Chen, Kun-Chih
    Liao, Yuan-Hou
    [J]. 2019 INTERNATIONAL SOC DESIGN CONFERENCE (ISOCC), 2019, : 65 - 66
  • [2] Adaptive Machine Learning-based Temperature Prediction Scheme for Thermal-aware NoC System
    Chen, Kun-Chih
    Liao, Yuan-Hao
    [J]. 2020 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2020,
  • [3] Machine Learning-based Prediction for Dynamic, Runtime Architectural Optimizations of Embedded Systems
    Vazquez, Ruben
    Gordon-Ross, Ann
    Stitt, Greg
    [J]. 2019 IEEE NORDIC CIRCUITS AND SYSTEMS CONFERENCE (NORCAS) - NORCHIP AND INTERNATIONAL SYMPOSIUM OF SYSTEM-ON-CHIP (SOC), 2019,
  • [4] Machine Learning-Based Prediction of the Martensite Start Temperature
    Wentzien, Marcel
    Koch, Marcel
    Friedrich, Thomas
    Ingber, Jerome
    Kempka, Henning
    Schmalzried, Dirk
    Kunert, Maik
    [J]. STEEL RESEARCH INTERNATIONAL, 2024,
  • [5] Characterizing Machine Learning-Based Runtime Prefetcher Selection
    Alcorta, Erika S.
    Madhav, Mahesh
    Afoakwa, Richard
    Tetrick, Scott
    Yadwadkar, Neeraja J.
    Gerstlauer, Andreas
    [J]. IEEE COMPUTER ARCHITECTURE LETTERS, 2024, 23 (02) : 146 - 149
  • [6] Machine Learning-Based Academic Result Prediction System
    Bhushan, Megha
    Verma, Utkarsh
    Garg, Chetna
    Negi, Arun
    [J]. INTERNATIONAL JOURNAL OF SOFTWARE INNOVATION, 2024, 12 (01)
  • [7] MACHINE LEARNING-BASED ENERGY USE PREDICTION FOR THE SMART BUILDING ENERGY MANAGEMENT SYSTEM
    Sari, Mustika
    Berawi, Mohammed Ali
    Zagloel, Teuku Yuri
    Madyaningarum, Nunik
    Miraj, Perdana
    Pranoto, Ardiansyah Ramadhan
    Susantono, Bambang
    Woodhead, Roy
    [J]. JOURNAL OF INFORMATION TECHNOLOGY IN CONSTRUCTION, 2023, 28 : 621 - 644
  • [8] Machine learning-based prediction of transfusion
    Mitterecker, Andreas
    Hofmann, Axel
    Trentino, Kevin M.
    Lloyd, Adam
    Leahy, Michael F.
    Schwarzbauer, Karin
    Tschoellitsch, Thomas
    Boeck, Carl
    Hochreiter, Sepp
    Meier, Jens
    [J]. TRANSFUSION, 2020, 60 (09) : 1977 - 1986
  • [9] Machine learning-based beta transus temperature prediction for titanium alloys
    Niu, Yong
    Hong, Zhi-qiang
    Wang, Yao-qi
    Zhu, Yan-chun
    [J]. JOURNAL OF MATERIALS RESEARCH AND TECHNOLOGY-JMR&T, 2023, 23 : 515 - 529
  • [10] Machine learning-based radiotherapy time prediction and treatment scheduling management
    Xie, Lisiqi
    Xu, Dan
    He, Kangjian
    Tian, Xin
    [J]. JOURNAL OF APPLIED CLINICAL MEDICAL PHYSICS, 2023, 24 (09):