GPU Auto-tuning Framework for Optimal Performance and Power Consumption

Cited by: 0

Authors:

Cheema, Sunbal ^{[1]}

Khan, Gul N. ^{[1]}

Affiliations:

[1] Toronto Metropolitan Univ, Dept Elect Comp & Biomed Engn, Toronto, ON, Canada

Source:

15TH WORKSHOP ON GENERAL PURPOSE PROCESSING USING GPU, GPGPU 2023 | 2023

Keywords:

Auto-tuning; Code transformation; Multi-objective optimization; GPU code regeneration; Performance power optimization; EFFICIENCY;

DOI:

10.1145/3589236.3589241

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

An auto-tuning framework for GPU devices is presented for tuning OpenCL application kernels. The GPU tuner employs a multi-objective optimization methodology to improve the performance and power consumption of applications. It efficiently explores a user-defined solution space comprising possible tunable algorithmic and hardware counter variations through code transformations. The methodology targets GPU code tuning situations where performance and energy consumption are critical. The proposed framework is evaluated on 2D convolution kernels. It utilizes a non-dominated sorting genetic algorithm with hardware power sensor data for application code transformation through code rewrite and validation. Various algorithmic variations such as loop unrolling, caching, workgroup size, and memory utilization are applied. The final Pareto-optimal configurations used around 30% less power and executed about 4% faster. The analysis shows the convergence of the optimization and a 45% improvement in standard deviation.
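
The abstract refers to non-dominated sorting over two objectives (execution time and power). The following is a minimal sketch of that ranking idea only, not the authors' implementation: the function names, the `(time, power)` tuple layout, and the sample measurements are all hypothetical, and both objectives are assumed to be minimized.

```python
from typing import List, Sequence, Tuple

# Hypothetical objective vector: (execution time, power draw), both minimized.
Objectives = Tuple[float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated_sort(points: Sequence[Objectives]) -> List[List[int]]:
    """Group point indices into fronts; front 0 is the Pareto-optimal set."""
    remaining = set(range(len(points)))
    fronts: List[List[int]] = []
    while remaining:
        # A point joins the current front if no other remaining point dominates it.
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Made-up measurements for five kernel configurations (illustration only).
configs = [(10.0, 50.0), (9.6, 35.0), (9.0, 60.0), (11.0, 34.0), (10.5, 55.0)]
fronts = non_dominated_sort(configs)
print(fronts)
```

In this sketch, configurations in front 0 are the ones a tuner would retain as trade-off candidates: none of them can be improved in one objective without worsening the other. A full genetic algorithm would add selection, crossover, mutation, and a diversity measure on top of this ranking.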


Pages: 1 / 6

Page count: 6

50 entries in total

[1] Bayesian Optimization for auto-tuning GPU kernels
Willemsen, Floris-Jan
van Nieuwpoort, Rob
van Werkhoven, Ben
PROCEEDINGS OF PERFORMANCE MODELING, BENCHMARKING AND SIMULATION OF HIGH PERFORMANCE COMPUTER SYSTEMS (PMBS 2021), 2021, : 106 - 117
[2] Optimizing and Auto-tuning Belief Propagation on the GPU
Grauer-Gray, Scott
Cavazos, John
LANGUAGES AND COMPILERS FOR PARALLEL COMPUTING, 2011, 6548 : 121 - 135
[3] Toward Techniques for Auto-tuning GPU Algorithms
Davidson, Andrew
Owens, John
APPLIED PARALLEL AND SCIENTIFIC COMPUTING, PT II, 2012, 7134 : 110 - 119
[4] Adaptive GPU Array Layout Auto-Tuning
Weber, Nicolas
Goesele, Michael
PROCEEDINGS OF THE ACM WORKSHOP ON SOFTWARE ENGINEERING METHODS FOR PARALLEL AND HIGH PERFORMANCE APPLICATIONS (SEM4HPC'16), 2016, : 21 - 28
[5] A History-Based Auto-Tuning Framework for Fast and High-Performance DNN Design on GPU
Mu, Jiandong
Wang, Mengdi
Li, Lanbo
Yang, Jun
Lin, Wei
Zhang, Wei
PROCEEDINGS OF THE 2020 57TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2020,
[6] ATF: A Generic Auto-Tuning Framework
Rasch, Ari
Haidl, Michael
Gorlatch, Sergei
2017 19TH IEEE INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS (HPCC) / 2017 15TH IEEE INTERNATIONAL CONFERENCE ON SMART CITY (SMARTCITY) / 2017 3RD IEEE INTERNATIONAL CONFERENCE ON DATA SCIENCE AND SYSTEMS (DSS), 2017, : 64 - 71
[7] ATF: A Generic Auto-Tuning Framework
Rasch, Ari
Gorlatch, Sergei
HPDC '18: PROCEEDINGS OF THE 27TH INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE PARALLEL AND DISTRIBUTED COMPUTING: POSTERS/DOCTORAL CONSORTIUM, 2018, : 3 - 4
[8] Meta-programming and Auto-tuning in the Search for High Performance GPU Code
Vollmer, Michael
Svensson, Bo Joel
Holk, Eric
Newton, Ryan R.
FHPC'15 PROCEEDINGS OF THE 4TH ACM SIGPLAN WORKSHOP ON FUNCTIONAL HIGH-PERFORMANCE COMPUTING, 2015, : 1 - 11
[9] Testing and Auto-Tuning GPU code with Kernel Tuner
van Werkhoven, Ben
2019 18TH INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED COMPUTING (ISPDC 2019), 2019, : XXI - XXI
[10] Efficient Auto-Tuning of Parallel Programs with Interdependent Tuning Parameters via Auto-Tuning Framework (ATF)
Rasch, Ari
Schulze, Richard
Steuwer, Michel
Gorlatch, Sergei
ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2021, 18 (01)
