Exploiting Direct Memory Operands in GPU Instructions

Cited by: 0
Authors
Mohammadpur-Fard, Ali [1]
Darabi, Sina [2, 3]
Falahati, Hajar [2, 4]
Mahani, Negin [2, 4, 5]
Sarbazi-Azad, Hamid [1, 2]
Affiliations
[1] Sharif Univ Technol, Dept Comp Engn, Tehran 111559466, Iran
[2] Inst Res Fundamental Sci IPM, Sch Comp Sci, Tehran 195383351, Iran
[3] Univ Svizzera Italiana USI, Fac Informat, CH-6900 Lugano, Switzerland
[4] Barcelona Supercomp Ctr BSC, Barcelona 08034, Spain
[5] Shahid Bahonar Univ, Dept Comp Engn, Higher Educ Complex Zarand, Kerman 761691411, Iran
Keywords
Registers; Graphics processing units; Computer architecture; Reduced instruction set computing; Arithmetic; Hardware; Standards; CISC; GPGPU; RISC; register file
DOI
10.1109/LCA.2024.3371062
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
GPUs are widely used for diverse applications, particularly data-parallel tasks such as machine learning and scientific computing. However, their efficiency is hindered by architectural limitations in handling memory loads, inherited from historical RISC processors, which cause high register file contention. We observe that a significant fraction (around 26%) of the values present in the register file are typically used only once, and that these values contribute more than 25% of the total register file bank conflicts, on average. This paper addresses the challenge of single-use memory values in the GPU register file (i.e., data values used only once), which waste space and increase latency. To this end, we introduce a novel mechanism, inspired by CISC architectures, that replaces single-use loads with direct memory operands in arithmetic operations. Our approach improves performance by 20% and reduces energy consumption by 18%, on average, with negligible (<1%) hardware overhead.
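The idea of replacing single-use loads with direct memory operands can be illustrated with a minimal sketch. This is not the mechanism from the paper, which the abstract does not specify; it is a toy transformation under stated assumptions. The `Instr` type, the `fold_single_use_loads` function, and the three-field instruction format are all hypothetical. The sketch assumes each register is written exactly once, addresses are symbolic strings such as `[a]`, and there are no stores that could change memory between a load and its consumer.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical toy instruction: op, destination register, sources."""
    op: str                # "ld" for loads, anything else is arithmetic
    dst: str               # destination register name
    srcs: Tuple[str, ...]  # source registers, or one symbolic address for "ld"


def fold_single_use_loads(program: List[Instr]) -> List[Instr]:
    """Fold loads whose value is read exactly once into their consumer.

    Assumes single-assignment registers and no intervening stores, so
    moving the memory read to the consumer cannot change the result.
    """
    # Count how often each name appears as a source operand.
    uses = Counter(src for ins in program for src in ins.srcs)

    # Map each single-use loaded register to the address it was loaded from.
    foldable = {
        ins.dst: ins.srcs[0]
        for ins in program
        if ins.op == "ld" and uses[ins.dst] == 1
    }

    result = []
    for ins in program:
        if ins.op == "ld" and ins.dst in foldable:
            continue  # the load disappears; its register is never allocated
        # Replace folded registers with a direct memory operand.
        new_srcs = tuple(foldable.get(src, src) for src in ins.srcs)
        result.append(Instr(ins.op, ins.dst, new_srcs))
    return result


program = [
    Instr("ld", "r1", ("[a]",)),       # r1 is read once: foldable
    Instr("ld", "r2", ("[b]",)),       # r2 is read twice: kept in a register
    Instr("add", "r3", ("r1", "r2")),
    Instr("mul", "r4", ("r3", "r2")),
]
folded = fold_single_use_loads(program)
for ins in folded:
    print(ins.op, ins.dst, ", ".join(ins.srcs))
```

In this example the load into `r1` is removed and the `add` reads `[a]` directly, while `r2` stays in a register because two instructions read it. A real GPU implementation would also have to account for memory latency, stores, and control flow, none of which this sketch models.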
Pages: 162-165
Page count: 4