MEMORY ACCESS COALESCING - A TECHNIQUE FOR ELIMINATING REDUNDANT MEMORY ACCESSES

Cited by: 0
Authors
DAVIDSON, JW [1]
JINTURKAR, S [1]
Affiliation
[1] UNIV VIRGINIA,DEPT COMP SCI,CHARLOTTESVILLE,VA 22903
Source
SIGPLAN NOTICES | 1994 / Vol. 29 / No. 06
Keywords
DOI
Not available
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
As microprocessor speeds increase, memory bandwidth is increasingly the performance bottleneck for microprocessors. This has occurred because innovation and technological improvements in processor design have outpaced advances in memory design. Most attempts at addressing this problem have involved hardware solutions. Unfortunately, these solutions do little to help the situation with respect to current microprocessors. In previous work, we developed, implemented, and evaluated an algorithm that exploited the ability of newer machines with wide buses to load/store multiple floating-point operands in a single memory reference. This paper describes a general code improvement algorithm that transforms code to better exploit the available memory bandwidth on existing microprocessors as well as wide-bus machines. Where possible and advantageous, the algorithm coalesces narrow memory references into wide ones. An interesting characteristic of the algorithm is that some decisions about the applicability of the transformation are made at run time. This dynamic analysis significantly increases the probability of the transformation being applied. The code improvement transformation was implemented and added to the repertoire of code improvements of an existing retargetable optimizing back end. Using three current architectures as evaluation platforms, the effectiveness of the transformation was measured on a set of compute- and memory-intensive programs. Interestingly, the effectiveness of the transformation varied significantly with respect to the instruction-set architecture of the tested platform. For one of the tested architectures, improvements in execution speed ranging from 5 to 40 percent were observed. For another, the improvements in execution speed ranged from 5 to 20 percent, while for yet another, the transformation resulted in slower code for all programs.
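The idea of coalescing narrow references into wide ones, with part of the applicability decision deferred to run time, can be illustrated with a small hand-written sketch in C. This is not the authors' algorithm, which the abstract describes as a transformation inside an optimizing back end; it only shows the shape of the resulting code under stated assumptions. The function names `sum_narrow` and `sum_coalesced` are hypothetical, the element width (16 bits) and wide width (64 bits) are chosen for illustration, and the only run-time check shown is an alignment test.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Baseline: one narrow (16-bit) memory reference per element. */
uint64_t sum_narrow(const uint16_t *a, size_t n) {
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Coalesced variant (illustrative assumption, not the paper's code):
 * four adjacent 16-bit references are replaced by one 64-bit reference,
 * and the individual values are extracted with shifts and masks. */
uint64_t sum_coalesced(const uint16_t *a, size_t n) {
    /* Run-time applicability check: take the wide path only when the
     * base address is aligned for a 64-bit access; otherwise fall back
     * to the original narrow loop. */
    if ((uintptr_t)a % sizeof(uint64_t) != 0)
        return sum_narrow(a, n);

    uint64_t total = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        uint64_t wide;
        memcpy(&wide, a + i, sizeof wide);   /* one wide load */
        for (int lane = 0; lane < 4; lane++)
            total += (wide >> (16 * lane)) & 0xFFFFu;
    }
    /* Leftover elements that do not fill a whole wide word. */
    for (; i < n; i++)
        total += a[i];
    return total;
}
```

Because the sketch only sums the lanes, the result does not depend on byte order. The extra shift and mask instructions are also one plausible reason such a transformation could slow code down on some instruction sets, in line with the mixed results the abstract reports.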
Pages: 186 - 195
Number of pages: 10