AdCoalescer: An Adaptive Coalescer to Reduce the Inter-Module Traffic in MCM-GPUs

Citations: 0

Authors:

Zhang, Xu [1]

Zhang, Guangda [1]

Wang, Lu [1]

Zhao, Xia [1]

Affiliations:

[1] Acad Mil Sci, Def Innovat Inst, Beijing, Peoples R China

Source:

2024 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS, IPDPSW 2024 | 2024

Keywords:

Multi-Chip-Module (MCM) GPU; data sharing; coalescing

DOI:

10.1109/IPDPSW63119.2024.00191

Chinese Library Classification:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

The push for greater computing capabilities has led to the development of Multi-Chip-Module GPUs (MCM-GPUs), advancing parallel processing potential. Unfortunately, MCM-GPUs encounter a notable challenge: the inter-module network becomes a performance bottleneck. Within MCM-GPUs, a large proportion of memory accesses by Streaming Multiprocessors (SMs) must traverse this inter-module network to access remote memory, encountering bandwidth constraints and increased latency. This is in contrast to the efficient network-on-chip designs in single-module GPU architectures. In MCM-GPUs, we identify significant data access redundancy among SMs within a GPU module, which can be coalesced to reduce the network pressure. However, directly coalescing by recording every memory address is inefficient, as a significant number of memory requests are directed to private data addresses, thus underutilizing the hardware resources. To address this challenge, we introduce the Adaptive Coalescer (AdCoalescer). AdCoalescer is a novel framework designed to adaptively coalesce memory requests from different SMs sent to the same cache lines, especially those likely to be concurrently accessed by multiple SMs. Our evaluations validate the AdCoalescer design in alleviating the challenges posed by the inter-module network. On average, AdCoalescer achieves a performance improvement of 22.5% (with up to 71.9% improvement) compared to traditional designs, with minimal hardware cost.
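
The basic idea of coalescing requests to the same cache line can be illustrated with a small software model. The sketch below is only an illustration and not the authors' hardware design: the class name `ToyCoalescer`, the 128-byte line size, and the `request`/`reply` interface are all assumptions introduced here, and the adaptive selection of likely-shared lines described in the abstract is not modeled.

```python
from collections import defaultdict

LINE_BYTES = 128  # assumed cache-line size, not taken from the paper


def line_of(addr):
    """Map a byte address to its cache-line index."""
    return addr // LINE_BYTES


class ToyCoalescer:
    """Hypothetical model: merges in-flight remote requests to the same line."""

    def __init__(self):
        self.pending = defaultdict(list)  # line index -> SM ids waiting on it
        self.sent = 0  # requests that actually cross the inter-module network

    def request(self, sm_id, addr):
        line = line_of(addr)
        if line not in self.pending:
            # First requester for this line triggers the remote access.
            self.sent += 1
        # Later requesters piggyback on the request already in flight.
        self.pending[line].append(sm_id)

    def reply(self, line):
        """Remote data arrived: return every SM that was waiting on this line."""
        return self.pending.pop(line, [])


coalescer = ToyCoalescer()
coalescer.request(sm_id=0, addr=0)    # line 0, goes out on the network
coalescer.request(sm_id=1, addr=64)   # line 0 again, merged
coalescer.request(sm_id=2, addr=128)  # line 1, goes out on the network
```

In this toy run, three SM requests produce two inter-module transfers, because the first two fall in the same line. A table like `pending` that records every address is exactly what the abstract calls inefficient when most requests target private data, which is the motivation for tracking only lines likely to be shared.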


Pages: 1159-1160

Page count: 2
