Tightly Coupled Machine Learning Coprocessor Architecture With Analog In-Memory Computing for Instruction-Level Acceleration

Cited by: 4
Authors
Chung, SungWon [1 ]
Wang, Jiemi [2 ]
Affiliations
[1] Univ Southern Calif, Dept Elect Engn, Los Angeles, CA 90089 USA
[2] Samsung Austin R&D Ctr, Austin, TX 78746 USA
Keywords
Machine learning hardware accelerator; programmable accelerator; approximate analog computing; in-memory computing; analog datapath; analog register file; switched capacitor circuit; tightly coupled coprocessor; deep learning; SAR ADC; PROCESSOR; NETWORK; STORAGE; CMOS; MS/S; SRAM;
DOI
10.1109/JETCAS.2019.2934929
CLC Classification Numbers
TM [Electrical Technology]; TN [Electronics and Communications Technology];
Subject Classification Codes
0808; 0809;
Abstract
Low-profile mobile computing platforms often need to execute a variety of machine learning algorithms with limited memory and processing power. To address this challenge, this work presents Coara, an instruction-level processor acceleration architecture that efficiently integrates an approximate analog in-memory computing coprocessor for accelerating general machine learning applications by exploiting an analog register file cache. Instruction-level acceleration offers true programmability beyond the degree of freedom provided by reconfigurable machine learning accelerators, and it allows the code generation stage of a compiler back-end to control coprocessor execution and data flow, so that applications do not need high-level machine learning software frameworks with a large memory footprint. Conventional analog and mixed-signal accelerators suffer from the overhead of frequent data conversion between analog and digital signals. To solve this classical problem, Coara uses an analog register file cache, which interfaces the analog in-memory computing coprocessor with the digital register file of the processor core. As a result, more than 90% of the ADC and DAC data conversion overhead can be eliminated by temporarily storing the result of analog computation in a switched-capacitor analog memory cell until a data dependency occurs. A cycle-accurate Verilog RTL model of the proposed architecture is evaluated with 45 nm CMOS technology parameters while executing machine learning benchmark computation codes generated by a customized cross-compiler, without using machine learning software frameworks.
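To make the data-conversion argument concrete, the sketch below is a minimal, purely behavioral Python model, not the paper's Verilog RTL or ISA; the register names, instruction trace, and the "naive" convert-every-instruction baseline are illustrative assumptions. It tracks which register values currently live only in the analog register file cache and charges an ADC conversion only when a digital-core instruction creates a true data dependency on such a value, which is the mechanism the abstract credits with eliminating most conversions.

```python
# Behavioral sketch only (assumed model, not the paper's design): results of analog
# coprocessor instructions stay in an analog register file cache and are converted
# by the ADC only when a digital instruction actually reads them.

from dataclasses import dataclass, field

@dataclass
class AnalogRegFileCache:
    """Tracks which registers currently hold values that exist only in analog form."""
    analog_valid: set = field(default_factory=set)  # regs whose latest value is analog-only
    adc_conversions: int = 0                        # analog -> digital conversions performed
    dac_conversions: int = 0                        # digital -> analog conversions performed
    analog_ops: int = 0                             # analog coprocessor instructions executed

    def analog_op(self, dst, srcs):
        """An analog (e.g., MAC / dot-product style) coprocessor instruction."""
        self.analog_ops += 1
        for s in srcs:
            if s not in self.analog_valid:
                # Source only exists digitally: a DAC conversion is needed.
                self.dac_conversions += 1
        # Result is held in a switched-capacitor analog cell; no ADC conversion yet.
        self.analog_valid.add(dst)

    def digital_read(self, src):
        """A digital-core instruction reads `src`: a data dependency forces an ADC conversion."""
        if src in self.analog_valid:
            self.adc_conversions += 1
            self.analog_valid.discard(src)  # value is now available digitally

# Toy trace: a chain of analog ops whose intermediates are never read by the
# digital core, followed by a single digital read of the final result.
rf = AnalogRegFileCache()
rf.analog_op("a1", ["d0"])                 # first operand comes from the digital register file (1 DAC)
for i in range(1, 20):
    rf.analog_op(f"a{i+1}", [f"a{i}"])     # intermediates stay analog, no conversions
rf.digital_read("a20")                     # only the final value crosses back (1 ADC)

naive = 2 * rf.analog_ops                  # strawman: ADC + DAC around every analog instruction
actual = rf.adc_conversions + rf.dac_conversions
print(f"conversions: {actual} vs naive {naive} "
      f"({100 * (1 - actual / naive):.0f}% eliminated)")
```

For this toy chain of 20 analog instructions whose only digital consumer is the final result, the model performs 2 conversions instead of the 40 a convert-every-instruction baseline would need, a 95% reduction consistent in spirit with the >90% figure claimed above.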
Pages: 544 - 561
Page count: 18
Related Papers
30 records in total
  • [1] Deep learning acceleration based on in-memory computing
    Eleftheriou, E.
    Le Gallo, M.
    Nandakumar, S. R.
    Piveteau, C.
    Boybat, I.
    Joshi, V.
    Khaddam-Aljameh, R.
    Dazzi, M.
    Giannopoulos, I.
    Karunaratne, G.
    Kersting, B.
    Stanisavljevic, M.
    Jonnalagadda, V. P.
    Ioannou, N.
    Kourtis, K.
    Francese, P. A.
    Sebastian, A.
    [J]. IBM JOURNAL OF RESEARCH AND DEVELOPMENT, 2019, 63 (06)
  • [2] In-Memory Computing for Machine Learning and Deep Learning
    Lepri, N.
    Glukhov, A.
    Cattaneo, L.
    Farronato, M.
    Mannocci, P.
    Ielmini, D.
    [J]. IEEE JOURNAL OF THE ELECTRON DEVICES SOCIETY, 2023, 11 : 587 - 601
  • [3] Evaluating an Analog Main Memory Architecture for All-Analog In-Memory Computing Accelerators
    Adam, Kazybek
    Monga, Dipesh
    Numan, Omar
    Singh, Gaurav
    Halonen, Kari
    Andraud, Martin
    [J]. 2024 IEEE 6TH INTERNATIONAL CONFERENCE ON AI CIRCUITS AND SYSTEMS, AICAS 2024, 2024, : 248 - 252
  • [4] In-Memory Computing in Emerging Memory Technologies for Machine Learning: An Overview
    Roy, Kaushik
    Chakraborty, Indranil
    Ali, Mustafa
    Ankit, Aayush
    Agrawal, Amogh
    [J]. PROCEEDINGS OF THE 2020 57TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2020,
  • [5] Instruction-level acceleration for drawing operations of Android system based on domestic Unicore architecture
    Ling, M. (trio@seu.eud.cn)
    [J]. Shanghai Jiaotong University, 1600, (47)
  • [6] A review of in-memory computing for machine learning: architectures, options
    Snasel, Vaclav
    Dang, Tran Khanh
    Kueng, Josef
    Kong, Lingping
    [J]. INTERNATIONAL JOURNAL OF WEB INFORMATION SYSTEMS, 2024, 20 (01) : 24 - 47
  • [7] ALPINE: Analog In-Memory Acceleration With Tight Processor Integration for Deep Learning
    Klein, Joshua
    Boybat, Irem
    Qureshi, Yasir Mahmood
    Dazzi, Martino
    Levisse, Alexandre
    Ansaloni, Giovanni
    Zapater, Marina
    Sebastian, Abu
    Atienza, David
    [J]. IEEE TRANSACTIONS ON COMPUTERS, 2023, 72 (07) : 1985 - 1998
  • [8] An Energy Efficient In-Memory Computing Machine Learning Classifier Scheme
    Jiang, Shixiong
    Priya, Sheena Ratnam
    Elango, Naveena
    Clay, James
    Sridhar, Ramalingam
    [J]. 2019 32ND INTERNATIONAL CONFERENCE ON VLSI DESIGN AND 2019 18TH INTERNATIONAL CONFERENCE ON EMBEDDED SYSTEMS (VLSID), 2019, : 157 - 162
  • [9] In-Memory Computing Architectures for Big Data and Machine Learning Applications
    Snasel, Vaclav
    Tran Khanh Dang
    Pham, Phuong N. H.
    Kueng, Josef
    Kong, Lingping
    [J]. FUTURE DATA AND SECURITY ENGINEERING. BIG DATA, SECURITY AND PRIVACY, SMART CITY AND INDUSTRY 4.0 APPLICATIONS, FDSE 2022, 2022, 1688 : 19 - 33
  • [10] In-Memory Computing based Machine Learning Accelerators: Opportunities and Challenges
    Roy, Kaushik
    [J]. PROCEEDINGS OF THE 32ND GREAT LAKES SYMPOSIUM ON VLSI 2022, GLSVLSI 2022, 2022, : 203 - 204