DNN Model Compression for IoT Domain-Specific Hardware Accelerators

Cited by: 14

Authors:

Russo, Enrico [1]

Palesi, Maurizio [1]

Monteleone, Salvatore [2]

Patti, Davide [1]

Mineo, Andrea [3]

Ascia, Giuseppe [1]

Catania, Vincenzo [1]

Affiliations:

[1] Univ Catania, Dept Elect Elect & Comp Engn, I-95125 Catania, Italy

[2] Kore Univ Enna, Fac Engn & Architecture, Dept Comp Engn, I-94100 Enna, Italy

[3] STMicroelectronics, Analog MEMs & Sensors Grp, I-95121 Catania, Italy

Source:

IEEE INTERNET OF THINGS JOURNAL | 2022, Volume 9, Issue 09

Keywords:

Deep neural network (DNN) accelerator; DNN model compression; domain-specific accelerator; energy versus performance versus accuracy tradeoff; neural networks; NETWORKS

DOI:

10.1109/JIOT.2021.3111723

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Machine learning techniques, particularly those based on neural networks, are increasingly used at the edge of the network by Internet of Things (IoT) nodes. Unfortunately, the computation capabilities demanded by those applications, together with their energy-efficiency constraints, exceed those offered by embedded general-purpose processors. For this reason, the use of domain-specific hardware accelerators (DSAs) is considered the most viable solution to the unsustainable "Turing tariff" of general-purpose hardware. Starting from the observation that memory and communication traffic account for a large fraction of the overall latency and energy in deep neural network (DNN) inference, this article proposes a new compression technique aimed at: 1) reducing the memory footprint for storing the model parameters of a DNN and 2) improving DNN inference latency and energy on resource-constrained IoT devices. The proposed compression technique, named LineCompress, is applied to a set of representative convolutional neural networks (CNNs) for object recognition mapped on a state-of-the-art DSA targeting resource-constrained IoT devices. We show that, on average, a 7.4× memory footprint reduction can be obtained. The resulting decrease in memory and communication traffic yields 77% and 87% reductions in inference latency and energy, respectively, trading off efficiency versus accuracy.


Pages: 6650-6662

Page count: 13
