DNN Model Compression for IoT Domain-Specific Hardware Accelerators

Cited by: 14
Authors
Russo, Enrico [1]
Palesi, Maurizio [1]
Monteleone, Salvatore [2]
Patti, Davide [1]
Mineo, Andrea [3]
Ascia, Giuseppe [1]
Catania, Vincenzo [1]
Affiliations
[1] Univ Catania, Dept Elect Elect & Comp Engn, I-95125 Catania, Italy
[2] Kore Univ Enna, Fac Engn & Architecture, Dept Comp Engn, I-94100 Enna, Italy
[3] STMicroelectronics, Analog MEMs & Sensors Grp, I-95121 Catania, Italy
Keywords
Deep neural network (DNN) accelerator; DNN model compression; domain-specific accelerator; energy versus performance versus accuracy tradeoff; neural networks; NETWORKS;
DOI
10.1109/JIOT.2021.3111723
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Machine learning techniques, particularly those based on neural networks, are increasingly used at the edge of the network by Internet of Things (IoT) nodes. Unfortunately, the computation capabilities demanded by those applications, together with their energy-efficiency constraints, exceed those offered by embedded general-purpose processors. For this reason, domain-specific hardware accelerators (DSAs) are considered the most viable solution to the unsustainable "Turing tariff" of general-purpose hardware. Starting from the observation that memory and communication traffic account for a large fraction of the overall latency and energy in deep neural network (DNN) inference, this article proposes a new compression technique aimed at: 1) reducing the memory footprint for storing the model parameters of a DNN and 2) improving DNN inference latency and energy on resource-constrained IoT devices. The proposed compression technique, named LineCompress, is applied to a set of representative convolutional neural networks (CNNs) for object recognition mapped on a state-of-the-art DSA targeted at resource-constrained IoT devices. We show that, on average, a 7.4× memory footprint reduction can be obtained, thus reducing the memory and communication traffic, which results in 77% and 87% reductions in inference latency and energy, respectively, trading off efficiency against accuracy.
Pages: 6650-6662
Number of pages: 13