The crossbar architecture, built from novel memristor devices, enables high-speed, energy-efficient processing-in-memory (PIM) for neural network computing. However, due to limitations of the manufacturing process, it is difficult to fabricate large arrays. As a consequence, the neural network's vector-matrix multiplication (VMM) must split its operands across several arrays, compute partial sums, and then accumulate the partial results. The neural network (NN) training process, which is often affected by device variations and ADC quantization noise in the PIM system, does not model this partial-sum process. Consequently, when NN models are inferred directly on the PIM platform without taking partial sums into account, accuracy suffers significantly, which makes it difficult to apply PIM computing to large-scale neural networks. Our work makes the following contributions: (i) We studied the partial-sum issue for the crossbar architecture when computing high-channel convolution (Conv) and drew three lessons from it. (ii) To address this issue, we propose techniques for avoiding or minimizing partial sums at the software and hardware levels, respectively. At the software level, we use group Conv rather than conventional Conv; at the hardware level, we present a new architecture adapted to depthwise separable Conv. Experiments were conducted with the CIFAR-10 dataset and the VGG8 network on an RRAM crossbar architecture. Results show improvements of 15.53% and 14.55% in accuracy, and of 0.28x and 0.94x in energy efficiency, at the software and hardware levels, respectively, compared to the conventional PIM scheme.
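The partial-sum effect described above can be illustrated with a minimal numerical sketch. This is not the paper's method, only a toy model under assumed parameters: the weight matrix is split row-wise into crossbar-sized tiles (a 128-row tile size is assumed here), each tile produces a partial sum, and a uniform quantizer stands in for the per-tile ADC before the partial results are accumulated.

```python
import numpy as np

def adc_quantize(x, bits=8):
    # Uniform quantizer standing in for the ADC: each tile's partial
    # sum is digitized to `bits` of resolution before accumulation.
    # (Illustrative model, not the ADC used in the paper.)
    scale = np.max(np.abs(x)) or 1.0
    levels = 2 ** (bits - 1) - 1
    return np.round(x / scale * levels) / levels * scale

def tiled_vmm(x, W, tile=128, adc_bits=8):
    # Emulate a VMM whose operands exceed one crossbar array:
    # split W row-wise into `tile`-sized chunks, compute a partial
    # sum per chunk, quantize it, and accumulate the results.
    acc = np.zeros(W.shape[1])
    for start in range(0, W.shape[0], tile):
        part = x[start:start + tile] @ W[start:start + tile, :]
        acc += adc_quantize(part, adc_bits)
    return acc

rng = np.random.default_rng(0)
x = rng.standard_normal(512)
W = rng.standard_normal((512, 64))
exact = x @ W                                   # ideal full-precision VMM
approx = tiled_vmm(x, W, tile=128, adc_bits=8)  # tiled, quantized partial sums
err = np.max(np.abs(exact - approx))            # error introduced by partial sums
```

Because each partial sum is quantized independently, the accumulated result deviates from the full-precision VMM; this is the error source that an NN trained without modeling partial sums never sees.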