MEMORY ACCESS COALESCING - A TECHNIQUE FOR ELIMINATING REDUNDANT MEMORY ACCESSES

Authors
DAVIDSON, JW [1]
JINTURKAR, S [1]
Affiliation
[1] UNIV VIRGINIA,DEPT COMP SCI,CHARLOTTESVILLE,VA 22903
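Source
SIGPLAN NOTICES | 1994 / Vol. 29 / No. 06
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract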
As microprocessor speeds increase, memory bandwidth is increasingly the performance bottleneck for microprocessors. This has occurred because innovation and technological improvements in processor design have outpaced advances in memory design. Most attempts at addressing this problem have involved hardware solutions. Unfortunately, these solutions do little to help the situation with respect to current microprocessors. In previous work, we developed, implemented, and evaluated an algorithm that exploited the ability of newer machines with wide buses to load/store multiple floating-point operands in a single memory reference. This paper describes a general code improvement algorithm that transforms code to better exploit the available memory bandwidth on existing microprocessors as well as wide-bus machines. Where possible and advantageous, the algorithm coalesces narrow memory references into wide ones. An interesting characteristic of the algorithm is that some decisions about the applicability of the transformation are made at run time. This dynamic analysis significantly increases the probability of the transformation being applied. The code improvement transformation was implemented and added to the repertoire of code improvements of an existing retargetable optimizing back end. Using three current architectures as evaluation platforms, the effectiveness of the transformation was measured on a set of compute- and memory-intensive programs. Interestingly, the effectiveness of the transformation varied significantly with respect to the instruction-set architecture of the tested platform. For one of the tested architectures, improvements in execution speed ranging from 5 to 40 percent were observed. For another, the improvements in execution speed ranged from 5 to 20 percent, while for yet another, the transformation resulted in slower code for all programs.
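The idea of coalescing narrow references into wide ones, guarded by a run-time applicability test, can be illustrated with a small hand-written C sketch. This is not the paper's algorithm, which operates inside a compiler back end. The function names (`sum_narrow`, `sum_coalesced`, `coalescing_agrees`), the choice of 16-bit elements packed into 64-bit loads, and the use of an alignment test as the run-time check are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Baseline: one narrow (16-bit) memory reference per element. */
uint32_t sum_narrow(const uint16_t *a, size_t n) {
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Coalesced variant: one wide (64-bit) reference fetches four elements. */
uint32_t sum_coalesced(const uint16_t *a, size_t n) {
    /* Assumed run-time check: use wide loads only if the array starts
       on an 8-byte boundary; otherwise fall back to the narrow loop. */
    if ((uintptr_t)a % sizeof(uint64_t) != 0)
        return sum_narrow(a, n);

    uint32_t sum = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        uint64_t wide;
        /* memcpy expresses the wide load without violating C aliasing rules. */
        memcpy(&wide, a + i, sizeof wide);
        /* Extract the four 16-bit fields with shifts. All four are added,
           so the result does not depend on byte order. */
        sum += (uint16_t)wide;
        sum += (uint16_t)(wide >> 16);
        sum += (uint16_t)(wide >> 32);
        sum += (uint16_t)(wide >> 48);
    }
    /* Leftover elements are handled with narrow references. */
    for (; i < n; i++)
        sum += a[i];
    return sum;
}

/* Hypothetical self-check: returns 1 if both variants agree on an
   aligned array and on a deliberately misaligned view of it. */
int coalescing_agrees(void) {
    _Alignas(8) uint16_t data[11];
    assert((uintptr_t)data % 8 == 0);
    for (size_t i = 0; i < 11; i++)
        data[i] = (uint16_t)(1000 * i + 7);
    return sum_narrow(data, 11) == sum_coalesced(data, 11) &&
           sum_narrow(data + 1, 10) == sum_coalesced(data + 1, 10);
}
```

The extraction shifts are the extra work that coalescing introduces. Whether the saved memory references outweigh that cost depends on the instruction set, which is consistent with the abstract's observation that the transformation helped on some architectures and hurt on another.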
Pages: 186-195
Page count: 10