Kernel Shape Control for Row-Efficient Convolution on Processing-In-Memory Arrays

Cited by: 2

Authors:

Rhe, Johnny [1]

Jeon, Kang Eun [1]

Lee, Joo Chan [2]

Jeong, Seongmoon [2]

Ko, Jong Hwan [3]

Affiliations:

[1] Sungkyunkwan Univ, Dept Elect & Comp Engn, Suwon, South Korea

[2] Sungkyunkwan Univ, Dept Artificial Intelligence, Suwon, South Korea

[3] Sungkyunkwan Univ, Coll Informat & Commun Engn, Suwon, South Korea

Source:

2023 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER AIDED DESIGN, ICCAD | 2023

Funding:

National Research Foundation, Singapore;

Keywords:

processing-in-memory; shift and duplicate (SDK) weight mapping; weight pruning; neural compression; ARCHITECTURE; PRECISION;

DOI:

10.1109/ICCAD57390.2023.10323749

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

Processing-in-memory (PIM) architectures have been highlighted as one of the viable solutions for faster and more power-efficient convolutional neural network (CNN) inference. Recently, the shift and duplicate kernel (SDK) convolutional weight mapping scheme was proposed, achieving up to 50% throughput improvement over prior art. However, traditional pattern-based pruning methods, which were adopted for row-skipping and computing cycle reduction, are not optimal for the latest SDK mapping because of the structural irregularity caused by the shifted and duplicated kernels. To address this issue, we propose a method called kernel shape control (KERNTROL) that promotes structural regularity in order to achieve both a high row-skipping ratio and high model accuracy. Instead of permanently pruning certain weight elements, KERNTROL controls the kernel shapes by omitting certain weights based on their mapped columns. Compared with the latest pattern-based pruning approaches, KERNTROL achieves up to 36.4% improvement in compression rate and 38.6% in array utilization while maintaining the original model accuracy.


Pages: 9
