FPGA Processor In Memory Architectures (PIMs): Overlay or Overhaul ?

Cited by: 0

Authors:

Kabir, Md Arafat [1]

Kabir, Ehsan [1]

Hollis, Joshua [1]

Levy-Mackay, Eli [1]

Panahi, Atiyehsadat [2]

Bakos, Jason [3]

Huang, Miaoqing [1]

Andrews, David [1]

Affiliations:

[1] Univ Arkansas, Dept Comp Sci & Comp Engn, Fayetteville, AR 72701 USA

[2] Univ South Carolina, Dept Comp Sci & Comp Engn, Columbia, SC 29208 USA

[3] Cadence Design Syst, Dept Comp Sci & Comp Engn, San Jose, CA USA

Source:

2023 33RD INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE LOGIC AND APPLICATIONS, FPL | 2023

Funding:

National Science Foundation (USA);

Keywords:

Processing-in-Memory; Bit-serial; Overlay; FPGA; Machine Learning; SIMD;

DOI:

10.1109/FPL60245.2023.00023

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The dominance of machine learning and the ending of Moore's law have renewed interest in Processor in Memory (PIM) architectures. This interest has produced several recent proposals to modify an FPGA's BRAM architecture to form a next-generation PIM reconfigurable fabric [1], [2]. PIM architectures can also be realized within today's FPGAs as overlays, without the need to modify the underlying FPGA architecture. To date, no study has compared the advantages of the two approaches. In this paper, we present a study that explores the comparative advantages of two proposed custom architectures and a PIM overlay running on a commodity FPGA. We created PiCaSO, a Processor in/near Memory Scalable and Fast Overlay architecture, as a representative PIM overlay. The results of this study show that the PiCaSO overlay achieves up to 80% of the peak throughput of the custom designs with 2.56x shorter latency and 25% - 43% better BRAM memory utilization efficiency. We then show how several key features of the PiCaSO overlay can be integrated into the custom PIM designs to further improve their throughput by 18%, latency by 19.5%, and memory efficiency by 6.2%.

Citation

Pages: 109-115

Page count: 7

50 items in total

[1] OVERVIEW OF A FPGA-BASED OVERLAY PROCESSOR
Yu, Yunxuan
Wu, Chen
Shi, Xiao
He, Lei
2019 CHINA SEMICONDUCTOR TECHNOLOGY INTERNATIONAL CONFERENCE (CSTIC), 2019,
[2] Extended overlay architectures for heterogeneous FPGA cluster management
Najem, Mohamad
Bollengier, Theotime
Le Lann, Jean-Christophe
Lagadec, Loic
JOURNAL OF SYSTEMS ARCHITECTURE, 2017, 78 : 1 - 14
[3] Demo: Overlay Architectures For Heterogeneous FPGA Cluster Management
Bollengier, Theotime
Najem, Mohamad
Le Lann, Jean-Christophe
Lagadec, Loic
PROCEEDINGS OF THE 2016 CONFERENCE ON DESIGN AND ARCHITECTURES FOR SIGNAL & IMAGE PROCESSING, 2016, : 239 - 240
[4] Time-Multiplexed FPGA Overlay Architectures: A Survey
Li, Xiangwei
Maskell, Douglas L.
ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS, 2019, 24 (05)
[5] OPU: An FPGA-Based Overlay Processor for Convolutional Neural Networks
Yu, Yunxuan
Wu, Chen
Zhao, Tiandong
Wang, Kun
He, Lei
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2020, 28 (01) : 35 - 47
[6] Design a Novel Memory Network for Processor-in-Memory Architectures
Chu, Slo-Li
Ho, Wen-Chih
Chen, Chien-Fang
Ceng, Kai-Wei
Liu, Ming-Han
2017 13TH INTERNATIONAL CONFERENCE ON SEMANTICS, KNOWLEDGE AND GRIDS (SKG 2017), 2017, : 56 - 61
[7] FPGA SAR processor with window memory accesses
Dou, Yong
Zhou, Jie
Lei, Yuanwu
Zhou, Xingming
2007 IEEE INTERNATIONAL CONFERENCE ON APPLICATION-SPECIFIC SYSTEMS, ARCHITECTURES, AND PROCESSORS, 2007, : 95 - 100
[8] Implementing logic in FPGA memory arrays: Heterogeneous memory architectures
Wilton, SJE
2002 IEEE INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE TECHNOLOGY (FPT), PROCEEDINGS, 2002, : 142 - 147
[9] Transformer-OPU: An FPGA-based Overlay Processor for Transformer Networks
Bai, Yueyin
Zhou, Hao
Zhao, Keqing
Chen, Jianli
Yu, Jun
Wang, Kun
2023 IEEE 31ST ANNUAL INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES, FCCM, 2023, : 222 - 222
[10] Effects of memory latencies on nonblocking processor/cache architectures
1600, ACM SIGARCH (Publ by ACM, New York, NY, USA):
