Compile Time Modeling of Off-Chip Memory Bandwidth for Parallel Loops

Cited by: 0

Authors:

Tolubaeva, Munara [1]

Yan, Yonghong [1]

Chapman, Barbara [1]

Affiliations:

[1] Univ Houston, Dept Comp Sci, Houston, TX 77204 USA

Source:

LANGUAGES AND COMPILERS FOR PARALLEL COMPUTING, LCPC 2013 | 2014 / Vol. 8664

Keywords:

Off-chip memory bandwidth; Performance modeling; Parallel loops; Contentions

DOI:

10.1007/978-3-319-09967-5_17

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In this paper, we present a statistical model that predicts the off-chip memory bandwidth a parallel loop requires during execution. It is a compile-time modeling technique that derives the correlation between the memory bandwidth requirement and the data access patterns of multithreaded applications. Compilers and performance tools could use this model to predict when an application will reach the sustainable memory bandwidth of the system during execution, and to determine an optimal number of threads for executing a specific parallel loop according to its memory reference patterns. Awareness of the performance impact of oversubscribed memory bandwidth can also help programmers account for the additional latency caused by contention and minimize the overhead by tuning the memory access behavior of their applications. We evaluated the model in terms of both technical accuracy and prediction accuracy by comparing the modeled results with measured results. The evaluation demonstrates its accuracy in both system bandwidth modeling and application bandwidth modeling.

Pages: 292-306

Page count: 15

50 records in total

[1] An Analytical Performance Model for Partitioning Off-Chip Memory Bandwidth
Wang, Ruisheng
Chen, Lizhong
Pinkston, Timothy Mark
IEEE 27TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS 2013), 2013, : 165 - 176
[2] Flex Memory: Exploiting and Managing Abundant Off-Chip Optical Bandwidth
Wang, Ying
Zhang, Lei
Han, Yinhe
Li, Huawei
Li, Xiaowei
2011 DESIGN, AUTOMATION & TEST IN EUROPE (DATE), 2011, : 968 - 973
[3] Accurately modeling the on-chip and off-chip GPU memory subsystem
Candel, Francisco
Petit, Salvador
Sahuquillo, Julio
Duato, Jose
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2018, 82 : 510 - 519
[4] Compression for reduction of off-chip video bandwidth
Jaspers, EGT
de With, PHN
MEDIA PROCESSORS 2002, 2002, 4674 : 110 - 120
[5] Understanding How Off-Chip Memory Bandwidth Partitioning in Chip Multiprocessors Affects System Performance
Liu, Fang
Jiang, Xiaowei
Solihin, Yan
HPCA-16 2010: SIXTEENTH INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE, PROCEEDINGS, 2010, : 57 - 68
[6] OSM: Off-Chip Shared Memory for GPUs
Darabi, Sina
Yousefzadeh-Asl-Miandoab, Ehsan
Akbarzadeh, Negar
Falahati, Hajar
Lotfi-Kamran, Pejman
Sadrosadati, Mohammad
Sarbazi-Azad, Hamid
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (12) : 3415 - 3429
[7] High Bandwidth Off-Chip Memory Access Through Hybrid Switching and Inter-Chip Wireless Links
Gade, Sri Harsha
Mondal, Hemanta Kumar
Deb, Sujay
2018 IEEE COMPUTER SOCIETY ANNUAL SYMPOSIUM ON VLSI (ISVLSI), 2018, : 100 - 105
[8] Off-Chip Memory Bandwidth Minimization through Cache Partitioning for Multi-Core Platforms
Yu, Chenjie
Petrov, Peter
PROCEEDINGS OF THE 47TH DESIGN AUTOMATION CONFERENCE, 2010, : 132 - 137
[9] Using Switchable Pins to Increase Off-Chip Bandwidth in Chip-Multiprocessors
Chen, Shaoming
Irving, Samuel
Peng, Lu
Hu, Yue
Zhang, Ying
Srivastava, Ashok
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2017, 28 (01) : 274 - 289
[10] Off-Chip Memory Allocation for Neural Processing Units
Kvochko, Andrey
Maltsev, Evgenii
Balyshev, Artem
Malakhov, Stanislav
Efimov, Alexander
IEEE ACCESS, 2024, 12 : 9931 - 9939
