Hybrid, Scalable, Trace-Driven Performance Modeling of GPGPUs

Cited by: 7
Authors
Arafa, Yehia [1]
Badawy, Abdel-Hameed [1]
ElWazir, Ammar [1]
Barai, Atanu [1]
Eker, Ali [2]
Chennupati, Gopinath [3]
Santhi, Nandakishore [4]
Eidenbenz, Stephan [4]
Affiliations
[1] New Mexico State Univ, Klipsch Sch ECE, Las Cruces, NM 88003 USA
[2] Binghamton Univ, Binghamton, NY USA
[3] Amazon Alexa, New York, NY USA
[4] Los Alamos Natl Lab, Los Alamos, NM USA
Keywords
NVIDIA GPUs; Modeling and Simulation; Design Space Exploration; Performance Prediction; PTX; SASS; GPU; Roofline
DOI
10.1145/3458817.3476221
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we present PPT-GPU, a scalable performance prediction toolkit for GPUs. PPT-GPU achieves scalability through a hybrid high-level modeling approach where some computations are extrapolated and multiple parts of the model are parallelized. The tool's primary prediction models use pre-collected memory and instruction traces of the workloads to accurately capture the dynamic behavior of the kernels. PPT-GPU reports an extensive array of GPU performance metrics accurately while being easily extensible. We use a broad set of benchmarks to verify prediction accuracy. We compare the results against hardware metrics collected using vendor profiling tools and cycle-accurate simulators. The results show that the performance predictions are highly correlated to the actual hardware (MAPE: < 16% and Correlation: > 0.98). Moreover, PPT-GPU is orders of magnitude faster than cycle-accurate simulators. The comprehensiveness of the collected metrics can guide architects and developers to perform design space explorations. Moreover, the scalability of the tool enables conducting efficient and fast sensitivity analyses for performance-critical applications.
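The abstract summarizes accuracy with two figures, MAPE and correlation, without giving their formulas. The sketch below is illustrative only: it assumes the standard definitions (mean absolute percentage error relative to the measured value, and the Pearson correlation coefficient), and the cycle counts are hypothetical numbers, not results from the paper. The function names `mape` and `pearson` are likewise made up for this example and are not part of PPT-GPU.

```python
import math


def mape(predicted, measured):
    """Mean absolute percentage error (in percent), relative to measured values."""
    if len(predicted) != len(measured) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - m) / abs(m) for p, m in zip(predicted, measured))
    return 100.0 * total / len(measured)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-kernel cycle counts (assumed values, for illustration only).
measured_cycles = [1000.0, 2000.0, 4000.0, 8000.0]   # e.g. from a hardware profiler
predicted_cycles = [1100.0, 1900.0, 4200.0, 7600.0]  # e.g. from a performance model

error_percent = mape(predicted_cycles, measured_cycles)
correlation = pearson(predicted_cycles, measured_cycles)
print(f"MAPE: {error_percent:.2f}%  correlation: {correlation:.3f}")
```

A low MAPE means the individual predictions are close to the measurements, while a high correlation means the model ranks and scales kernels consistently even where the absolute values differ, which is why the two figures are reported together.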
Pages: 15
Related papers
50 in total
  • [21] Practical and Scalable ML-Driven Cloud Performance Debugging With Sage
    Gan, Yu
    Liang, Mingyu
    Dev, Sundar
    Lo, David
    Delimitrou, Christina
    IEEE MICRO, 2022, 42 (04) : 27 - 36
  • [22] Congestion and performance driven full-chip scalable routing framework
    Yao, HL
    Cai, YC
    Hong, XL
    Zhou, Q
    2005 6TH INTERNATIONAL CONFERENCE ON ASIC PROCEEDINGS, BOOKS 1 AND 2, 2005, : 768 - 771
  • [23] Accurately modeling speculative instruction fetching in trace-driven simulation
    Bhargava, R
    John, LK
    Matus, F
    1999 IEEE INTERNATIONAL PERFORMANCE, COMPUTING AND COMMUNICATIONS CONFERENCE, 1999, : 65 - 71
  • [24] TRACE-DRIVEN MODELING AND ANALYSIS OF CPU SCHEDULING IN A MULTIPROGRAMMING SYSTEM
    SHERMAN, S
    BROWNE, JC
    BASKETT, F
    COMMUNICATIONS OF THE ACM, 1972, 15 (12) : 1063 - &
  • [25] HYBRID COMPUTER PERFORMANCE MODELING SYSTEM
    FOXLEY, E
    COMPUTER JOURNAL, 1978, 21 (03) : 205 - 209
  • [26] Performance modeling of scalable encryption algorithm using parallel computation
    2013, UK Simulation Society, Clifton Lane, Nottingham, NG11 8NS, United Kingdom (14):
  • [27] Performance modeling of distributed hybrid architectures
    Spinnato, PF
    van Albada, GD
    Sloot, PMA
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2004, 15 (01) : 81 - 92
  • [28] Performance Modeling of Scalable Resource Allocations with the Imperial PEPA Compiler
    Sanders, William S.
    Srivastava, Srishti
    Banicescu, Ioana
    2022 21ST INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED COMPUTING (ISPDC 2022), 2022, : 99 - 106
  • [29] Modeling and performance analysis of Scalable Web Servers Deployed on the Cloud
    Aljohani, A. M. D.
    Holton, D. R. W.
    Awan, I.
    2013 EIGHTH INTERNATIONAL CONFERENCE ON BROADBAND, WIRELESS COMPUTING, COMMUNICATION AND APPLICATIONS (BWCCA 2013), 2013, : 238 - 242
  • [30] Modeling laser performance of scalable side pumped alkali laser
    Komashko, Aleksey M.
    Zweiback, Jason
    HIGH ENERGY/AVERAGE POWER LASERS AND INTENSE BEAM APPLICATIONS IV, 2010, 7581