Kernel Shape Control for Row-Efficient Convolution on Processing-In-Memory Arrays

Cited by: 2
Authors
Rhe, Johnny [1]
Jeon, Kang Eun [1]
Lee, Joo Chan [2]
Jeong, Seongmoon [2]
Ko, Jong Hwan [3]
Affiliations
[1] Sungkyunkwan Univ, Dept Elect & Comp Engn, Suwon, South Korea
[2] Sungkyunkwan Univ, Dept Artificial Intelligence, Suwon, South Korea
[3] Sungkyunkwan Univ, Coll Informat & Commun Engn, Suwon, South Korea
Funding
National Research Foundation, Singapore;
Keywords
processing-in-memory; shift and duplicate (SDK) weight mapping; weight pruning; neural compression; ARCHITECTURE; PRECISION;
DOI
10.1109/ICCAD57390.2023.10323749
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Processing-in-memory (PIM) architectures have been highlighted as a viable solution for faster and more power-efficient convolutional neural network (CNN) inference. Recently, the shift and duplicate kernel (SDK) convolutional weight mapping scheme was proposed, achieving up to 50% throughput improvement over the prior art. However, traditional pattern-based pruning methods, which were adopted for row-skipping and computing cycle reduction, are not optimal for the latest SDK mapping because of the structural irregularity caused by the shifted and duplicated kernels. To address this issue, we propose kernel shape control (KERNTROL), a method that promotes structural regularity to achieve a high row-skipping ratio while preserving model accuracy. Instead of pruning certain weight elements permanently, KERNTROL controls the kernel shapes by omitting certain weights based on their mapped columns. Compared with the latest pattern-based pruning approaches, KERNTROL achieves up to 36.4% improvement in compression rate and 38.6% in array utilization while maintaining the original model accuracy.
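The structural irregularity mentioned in the abstract can be illustrated with a toy example. The sketch below is not the authors' algorithm; it only assumes a single-channel 3x3 kernel mapped SDK-style onto a 4x4 input window, so that each array column holds one shifted copy of the kernel and each array row corresponds to one input position. All function names (`sdk_matrix`, `skippable_rows`, `clear_row`, `omitted_positions`) and the window size are hypothetical choices made for illustration.

```python
def sdk_matrix(kernel, window):
    """Build a toy SDK-style weight matrix (assumed layout, single channel).

    Rows: input positions of a window x window patch, in row-major order.
    Columns: one shifted copy of the kernel per output position.
    """
    k = len(kernel)
    shifts = window - k + 1
    columns = [(dy, dx) for dy in range(shifts) for dx in range(shifts)]
    matrix = []
    for y in range(window):
        for x in range(window):
            row = []
            for dy, dx in columns:
                ky, kx = y - dy, x - dx
                inside = 0 <= ky < k and 0 <= kx < k
                row.append(kernel[ky][kx] if inside else 0)
            matrix.append(row)
    return matrix


def skippable_rows(matrix):
    """Indices of rows whose weights are all zero (candidates for row-skipping)."""
    return [r for r, row in enumerate(matrix) if all(v == 0 for v in row)]


def clear_row(matrix, row_index):
    """Omit every weight mapped to one array row, leaving other rows intact."""
    return [[0] * len(row) if r == row_index else list(row)
            for r, row in enumerate(matrix)]


def omitted_positions(row_index, kernel_size, window):
    """For each column (shift), the kernel position that lands on row_index."""
    y, x = divmod(row_index, window)
    shifts = window - kernel_size + 1
    result = {}
    for dy in range(shifts):
        for dx in range(shifts):
            ky, kx = y - dy, x - dx
            if 0 <= ky < kernel_size and 0 <= kx < kernel_size:
                result[(dy, dx)] = (ky, kx)
    return result


def count_nonzero(matrix):
    return sum(1 for row in matrix for v in row if v != 0)


WINDOW = 4                                   # assumed parallel-window size
kernel = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # arbitrary nonzero weights

dense = sdk_matrix(kernel, WINDOW)

# Pattern-style pruning: the same kernel position (the center) is removed
# from every shifted copy.
center_pruned = [row[:] for row in kernel]
center_pruned[1][1] = 0
pattern = sdk_matrix(center_pruned, WINDOW)

# Column-dependent omission: remove whatever weight each column maps to row 5.
shaped = clear_row(dense, 5)

print("weights kept (pattern, shaped):", count_nonzero(pattern), count_nonzero(shaped))
print("skippable rows, pattern:", skippable_rows(pattern))
print("skippable rows, shaped: ", skippable_rows(shaped))
print("omitted kernel position per column:", omitted_positions(5, 3, WINDOW))
```

In this toy setting both variants drop the same number of weights, but removing a fixed kernel position scatters the zeros across different rows because each copy is shifted, whereas omitting a different kernel position per column lines the zeros up on a single row. The resulting per-column kernel shapes differ from one another, which is one plausible reading of "controlling kernel shapes based on their mapped columns".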
Pages: 9