MRPB: Memory Request Prioritization for Massively Parallel Processors

Cited by: 0
Authors
Jia, Wenhao [1]
Shaw, Kelly A. [2]
Martonosi, Margaret [1]
Affiliations
[1] Princeton Univ, Princeton, NJ 08544 USA
[2] Univ Richmond, Richmond, VA 23173 USA
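Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract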
Massively parallel, throughput-oriented systems such as graphics processing units (GPUs) offer high performance for a broad range of programs. They are, however, complex to program, especially because of their intricate memory hierarchies with multiple address spaces. In response, modern GPUs have widely adopted caches, hoping to provide smoother reductions in memory access traffic and latency. Unfortunately, GPU caches often have a mixed or unpredictable performance impact due to cache contention that results from the high thread counts in GPUs. We propose the memory request prioritization buffer (MRPB) to ease GPU programming and improve GPU performance. This hardware structure improves the caching efficiency of massively parallel workloads by applying two prioritization methods, request reordering and cache bypassing, to memory requests before they access a cache. MRPB then releases requests into the cache in a more cache-friendly order. The result is drastically reduced cache contention and improved use of the limited per-thread cache capacity. For a simulated 16 KB L1 cache, MRPB improves the average performance of the entire PolyBench and Rodinia suites by 2.65x and 1.27x, respectively, outperforming a state-of-the-art GPU cache management technique.
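The intuition behind request reordering can be illustrated with a toy simulation. This is only a sketch and not the paper's hardware design: the abstract does not say which key MRPB reorders on, so grouping by a hypothetical `source` ID (for example, a warp or thread-block ID), the fully associative LRU cache model, and the synthetic workload are all assumptions made for illustration. Cache bypassing is not modeled.

```python
from collections import OrderedDict, defaultdict


def lru_hits(requests, capacity):
    """Count hits in a fully associative LRU cache holding `capacity` lines."""
    cache = OrderedDict()
    hits = 0
    for _source, line in requests:
        if line in cache:
            hits += 1
            cache.move_to_end(line)        # mark as most recently used
        else:
            cache[line] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used line
    return hits


def reorder_by_source(requests):
    """Group requests by source ID, keeping arrival order within each group.

    Assumption: 'source' stands in for whatever signature a real buffer
    would use to sort requests into queues.
    """
    queues = defaultdict(list)
    for source, line in requests:
        queues[source].append((source, line))
    return [req for source in sorted(queues) for req in queues[source]]


def make_interleaved(num_sources=4, lines_per_source=2, passes=2):
    """Hypothetical workload: each source reuses its own private lines,
    but requests from different sources arrive interleaved."""
    reqs = []
    for _ in range(passes):
        for l in range(lines_per_source):
            for s in range(num_sources):
                reqs.append((s, s * lines_per_source + l))
    return reqs


if __name__ == "__main__":
    reqs = make_interleaved()
    print("interleaved hits:", lru_hits(reqs, capacity=2))
    print("reordered hits:  ", lru_hits(reorder_by_source(reqs), capacity=2))
```

In this synthetic case the interleaved stream thrashes a two-line cache, while the grouped stream lets each source reuse its lines before the next source's requests arrive. Real GPU workloads and the actual MRPB policies are considerably more involved.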
Pages: 272-283
Number of pages: 12