Compile Time Modeling of Off-Chip Memory Bandwidth for Parallel Loops

Cited by: 0
Authors
Tolubaeva, Munara [1]
Yan, Yonghong [1]
Chapman, Barbara [1]
Affiliations
[1] Univ Houston, Dept Comp Sci, Houston, TX 77204 USA
Keywords
Off-chip memory bandwidth; Performance modeling; Parallel loops; Contentions;
DOI
10.1007/978-3-319-09967-5_17
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we present a statistical model to predict the off-chip memory bandwidth required by a parallel loop during its execution. It is a compile-time modeling technique that derives the correlations between memory bandwidth requirement and data access patterns of multithreaded applications. This model could be used by the compiler and performance tools to predict when the sustainable memory bandwidth of the system will be reached by the application during execution, and to determine an optimal number of threads that should be configured to execute a specific parallel loop according to its memory reference patterns. Awareness of the performance impact of oversubscribed memory bandwidth can also help programmers to take into account the additional latency caused by the contention, and to minimize the overhead by tuning the memory access behavior of applications. We evaluated this model in terms of both technical accuracy and prediction accuracy by comparing the modeling results with the measured results. The evaluation demonstrates its accuracy in both system bandwidth modeling and application bandwidth modeling.
Pages: 292-306
Page count: 15