Cross-Architecture Performance Prediction (XAPP) Using CPU Code to Predict GPU Performance

Cited by: 63
Authors
Ardalani, Newsha [1]
Lestourgeon, Clint [1]
Sankaralingam, Karthikeyan [1]
Zhu, Xiaojin [1]
Affiliations
[1] Univ Wisconsin Madison, Madison, WI 53706 USA
Keywords
GPU; Cross-platform Prediction; Performance Modeling; Machine Learning; REGRESSION-MODELS; DESIGN SPACE; HARDWARE; BENCHMARKS
DOI
10.1145/2830772.2830780
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
GPUs have become prevalent and more general purpose, but GPU programming remains challenging and time consuming for the majority of programmers. In addition, it is not always clear which codes will benefit from being ported to a GPU. Therefore, a tool that estimates GPU performance for a piece of code before a GPU implementation is written is highly desirable. To this end, we propose Cross-Architecture Performance Prediction (XAPP), a machine-learning-based technique that uses only a single-threaded CPU implementation to predict GPU performance. Our paper is built on the following two insights: i) Execution time on a GPU is a function of program properties and hardware characteristics. ii) By examining a vast array of previously implemented GPU codes along with their CPU counterparts, we can use established machine learning techniques to learn the correlation between program properties, hardware characteristics, and GPU execution time. We use an adaptive two-level machine learning solution. Our results show that our tool is robust and accurate: we achieve 26.9% average error on a set of 24 real-world kernels. We also discuss practical usage scenarios for XAPP.
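The two insights in the abstract can be illustrated with a toy sketch: fit a model that maps a program property measured on the CPU to GPU execution time, then score it by average relative error. This is not the authors' adaptive two-level method, which the abstract does not specify. The single feature, the plain least-squares fit, the function names, and all numbers below are hypothetical, and "average error" is assumed here to mean the mean absolute relative error.

```python
from statistics import mean


def fit_line(features, gpu_times):
    """Least-squares fit of gpu_time = a * feature + b for one feature.

    Stand-in for the learning step only; the abstract does not describe
    the actual XAPP model, so this is an illustrative assumption.
    """
    mx, my = mean(features), mean(gpu_times)
    sxx = sum((x - mx) ** 2 for x in features)
    sxy = sum((x - mx) * (y - my) for x, y in zip(features, gpu_times))
    a = sxy / sxx
    return a, my - a * mx


def mean_relative_error(predicted, measured):
    """Mean of |predicted - measured| / measured over all kernels."""
    return mean(abs(p - m) / m for p, m in zip(predicted, measured))


# Made-up training data: one CPU-side program property per kernel and the
# GPU execution time measured for the matching GPU implementation.
train_feature = [1.0, 2.0, 3.0, 4.0]
train_gpu_time = [3.0, 5.0, 7.0, 9.0]
a, b = fit_line(train_feature, train_gpu_time)

# Made-up held-out kernels: predict from the CPU-side property alone,
# then compare against the measured GPU times.
test_feature = [5.0, 5.5]
test_gpu_time = [10.0, 16.0]
predictions = [a * x + b for x in test_feature]
error = mean_relative_error(predictions, test_gpu_time)
```

A real predictor would use many program properties and hardware characteristics rather than one feature; the sketch only shows the shape of the train, predict, and score loop.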
Pages: 725-737
Page count: 13