SAMBA: Sparsity Aware In-Memory Computing Based Machine Learning Accelerator

Cited by: 1
Authors
Kim, Dong Eun [1 ]
Ankit, Aayush [2 ]
Wang, Cheng [3 ]
Roy, Kaushik [1 ]
Affiliations
[1] Purdue Univ, Dept Elect & Comp Engn, W Lafayette, IN 47907 USA
[2] Microsoft, San Jose, CA 95112 USA
[3] Iowa State Univ, Dept Elect & Comp Engn, Ames, IA 50011 USA
Funding
National Science Foundation (USA)
Keywords
Computer architecture; Optimization; Load management; Convolution; In-memory computing; Hardware; Energy consumption; Accelerator; in-memory computing; neural networks; sparsity; COPROCESSOR;
DOI
10.1109/TC.2023.3257513
CLC Number
TP3 [Computing technology; computer technology]
Discipline Code
0812
Abstract
Machine Learning (ML) inference is typically dominated by highly data-intensive Matrix Vector Multiplication (MVM) computations that can be constrained by the memory bottleneck caused by massive data movement between processor and memory. Although analog in-memory computing (IMC) ML accelerators have been proposed to execute MVM with high efficiency, the latency and energy of such computing systems can be dominated by the large latency and energy costs of analog-to-digital converters (ADCs). By leveraging sparsity in ML workloads, reconfigurable ADCs can save MVM energy and latency by reducing the required ADC bit precision. However, this latency improvement can be hindered by the non-uniform sparsity of the weight matrices mapped into hardware. Moreover, data movement between MVM processing cores can become another factor that degrades overall system-level performance. To address these issues, we propose SAMBA, a Sparsity Aware IMC Based Machine Learning Accelerator. First, we propose load balancing during the mapping of weight matrices onto physical crossbars, eliminating non-uniformity in the sparsity of the mapped matrices. Second, we propose optimizations in arranging and scheduling the tiled MVM hardware to minimize the overhead of data movement across multiple processing cores. Our evaluations show that the proposed load-balancing technique improves performance, and the proposed optimizations further improve both performance and energy efficiency regardless of the sparsity condition. Combining load balancing and data-movement optimization with reconfigurable ADCs, our approach provides up to 2.38x speed-up and 1.54x better energy efficiency over state-of-the-art analog IMC-based ML accelerators on the ImageNet dataset with the ResNet-50 architecture.
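The two ideas in the abstract can be illustrated with a small sketch. This is not the paper's actual mapping algorithm, only an assumed greedy heuristic: rows of a sparse weight matrix are assigned to crossbars so that nonzero counts (and hence ADC workload) stay even, and the ADC bit precision is derived from the number of active rows summing on a bitline. The function names `balance_rows` and `adc_bits` are illustrative inventions.

```python
import math

def balance_rows(weight_rows, num_crossbars):
    """Greedily assign matrix rows to crossbars so the nonzero count per
    crossbar is balanced (longest-processing-time heuristic). Illustrative
    sketch only -- not SAMBA's exact load-balancing scheme."""
    nnz = [sum(1 for w in row if w != 0) for row in weight_rows]
    # Place the densest rows first.
    order = sorted(range(len(weight_rows)), key=lambda i: nnz[i], reverse=True)
    loads = [0] * num_crossbars
    assignment = [[] for _ in range(num_crossbars)]
    for i in order:
        target = loads.index(min(loads))  # least-loaded crossbar so far
        assignment[target].append(i)
        loads[target] += nnz[i]
    return assignment, loads

def adc_bits(active_rows):
    """With fewer active (nonzero) rows accumulating on a bitline, the
    partial-sum range shrinks, so a reconfigurable ADC can use fewer bits."""
    return max(1, math.ceil(math.log2(active_rows + 1)))
```

Balancing nonzeros across crossbars matters because the slowest (densest) crossbar sets the MVM latency; with even sparsity, every crossbar can run its ADC at a similarly reduced precision.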
Pages: 2615-2627
Page count: 13