SAMBA: Sparsity Aware In-Memory Computing Based Machine Learning Accelerator

Cited by: 1
Authors
Kim, Dong Eun [1 ]
Ankit, Aayush [2 ]
Wang, Cheng [3 ]
Roy, Kaushik [1 ]
Affiliations
[1] Purdue Univ, Dept Elect & Comp Engn, W Lafayette, IN 47907 USA
[2] Microsoft, San Jose, CA 95112 USA
[3] Iowa State Univ, Dept Elect & Comp Engn, Ames, IA 50011 USA
Funding
U.S. National Science Foundation;
Keywords
Computer architecture; Optimization; Load management; Convolution; In-memory computing; Hardware; Energy consumption; Accelerator; in-memory computing; neural networks; sparsity; COPROCESSOR;
DOI
10.1109/TC.2023.3257513
CLC Number
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812 ;
Abstract
Machine Learning (ML) inference is typically dominated by highly data-intensive Matrix Vector Multiplication (MVM) computations, which can be constrained by a memory bottleneck due to massive data movement between processor and memory. Although analog in-memory computing (IMC) ML accelerators have been proposed to execute MVM with high efficiency, the latency and energy of such computing systems can be dominated by the large latency and energy costs of analog-to-digital converters (ADCs). By leveraging sparsity in ML workloads, reconfigurable ADCs can save MVM energy and latency by reducing the required ADC bit precision. However, such latency improvement can be hindered by the non-uniform sparsity of the weight matrices mapped into hardware. Moreover, data movement between MVM processing cores may become another factor that degrades overall system-level performance. To address these issues, we propose SAMBA, a Sparsity Aware IMC Based Machine Learning Accelerator. First, we propose load balancing during the mapping of weight matrices into physical crossbars to eliminate non-uniformity in the sparsity of the mapped matrices. Second, we propose optimizations in arranging and scheduling the tiled MVM hardware to minimize the overhead of data movement across multiple processing cores. Our evaluations show that the proposed load balancing technique achieves a performance improvement, and the proposed optimizations further improve both performance and energy efficiency regardless of sparsity condition. Combining load balancing and data movement optimization in conjunction with reconfigurable ADCs, our proposed approach provides up to 2.38x speed-up and 1.54x energy efficiency over state-of-the-art analog IMC-based ML accelerators for the ImageNet dataset on the ResNet-50 architecture.
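The load-balancing idea in the abstract can be illustrated with a minimal sketch: assign weight-matrix rows to crossbars so that each crossbar receives a roughly equal number of nonzero weights, rather than mapping contiguous row blocks. This is an assumed greedy heuristic for illustration only (function and variable names are hypothetical, not from the paper):

```python
import numpy as np

def balance_rows(weight, n_xbars):
    """Greedily assign matrix rows to crossbars so the total nonzero
    count per crossbar is roughly equal (heaviest rows placed first,
    each into the currently least-loaded crossbar)."""
    nnz = np.count_nonzero(weight, axis=1)   # nonzeros per row
    order = np.argsort(nnz)[::-1]            # heaviest rows first
    loads = np.zeros(n_xbars, dtype=int)     # running nonzero load per crossbar
    assignment = {i: [] for i in range(n_xbars)}
    for r in order:
        x = int(np.argmin(loads))            # pick least-loaded crossbar
        assignment[x].append(int(r))
        loads[x] += int(nnz[r])
    return assignment, loads
```

Under a balanced mapping, the worst-case per-crossbar work (and thus the ADC latency dictated by the busiest crossbar) is bounded much more tightly than under a naive contiguous mapping of a non-uniformly sparse matrix.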
Pages: 2615 - 2627
Page count: 13
Related Papers
50 records in total
  • [1] In-Memory Computing for Machine Learning and Deep Learning
    Lepri, N.
    Glukhov, A.
    Cattaneo, L.
    Farronato, M.
    Mannocci, P.
    Ielmini, D.
    [J]. IEEE JOURNAL OF THE ELECTRON DEVICES SOCIETY, 2023, 11 : 587 - 601
  • [2] In-Memory Computing based Machine Learning Accelerators: Opportunities and Challenges
    Roy, Kaushik
    [J]. PROCEEDINGS OF THE 32ND GREAT LAKES SYMPOSIUM ON VLSI 2022, GLSVLSI 2022, 2022, : 203 - 204
  • [3] MULTIFUNCTIONAL RRAM CHIP WITH CONFIGURABILITY FOR SPARSITY-AWARE IN-MEMORY ISING MACHINE
    Yue, Wenshuo
    Jing, Zhaokun
    Yan, Bonan
    Tao, Yaoyu
    Zhang, Teng
    Huang, Ru
    Yang, Yuchao
    [J]. CONFERENCE OF SCIENCE & TECHNOLOGY FOR INTEGRATED CIRCUITS, 2024 CSTIC, 2024,
  • [4] In-Memory Computing in Emerging Memory Technologies for Machine Learning: An Overview
    Roy, Kaushik
    Chakraborty, Indranil
    Ali, Mustafa
    Ankit, Aayush
    Agrawal, Amogh
    [J]. PROCEEDINGS OF THE 2020 57TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2020,
  • [5] Circuits and Architectures for In-Memory Computing-Based Machine Learning Accelerators
    Ankit, Aayush
    Chakraborty, Indranil
    Agrawal, Amogh
    Ali, Mustafa
    Roy, Kaushik
    [J]. IEEE MICRO, 2020, 40 (06) : 8 - 21
  • [6] A review of in-memory computing for machine learning: architectures, options
    Snasel, Vaclav
    Dang, Tran Khanh
    Kueng, Josef
    Kong, Lingping
    [J]. INTERNATIONAL JOURNAL OF WEB INFORMATION SYSTEMS, 2024, 20 (01) : 24 - 47
  • [7] In-Memory Computing based Accelerator for Transformer Networks for Long Sequences
    Laguna, Ann Franchesca
    Kazemi, Arman
    Niemier, Michael
    Hu, X. Sharon
    [J]. PROCEEDINGS OF THE 2021 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE 2021), 2021, : 1839 - 1844
  • [8] CMOS Annealing Machine: an In-memory Computing Accelerator to Process Combinatorial Optimization Problems
    Yamaoka, Masanao
    Okuyama, Takuya
    Hayashi, Masato
    Yoshimura, Chihiro
    Takemoto, Takashi
    [J]. 2019 IEEE CUSTOM INTEGRATED CIRCUITS CONFERENCE (CICC), 2019,
  • [9] Deep learning acceleration based on in-memory computing
    Eleftheriou, E.
    Le Gallo, M.
    Nandakumar, S. R.
    Piveteau, C.
    Boybat, I
    Joshi, V
    Khaddam-Aljameh, R.
    Dazzi, M.
    Giannopoulos, I
    Karunaratne, G.
    Kersting, B.
    Stanisavljevic, M.
    Jonnalagadda, V. P.
    Ioannou, N.
    Kourtis, K.
    Francese, P. A.
    Sebastian, A.
    [J]. IBM JOURNAL OF RESEARCH AND DEVELOPMENT, 2019, 63 (06)
  • [10] In-Memory Computing Architectures for Big Data and Machine Learning Applications
    Snasel, Vaclav
    Tran Khanh Dang
    Pham, Phuong N. H.
    Kueng, Josef
    Kong, Lingping
    [J]. FUTURE DATA AND SECURITY ENGINEERING. BIG DATA, SECURITY AND PRIVACY, SMART CITY AND INDUSTRY 4.0 APPLICATIONS, FDSE 2022, 2022, 1688 : 19 - 33