HAPI: Hardware-Aware Progressive Inference

Cited by: 23
Authors
Laskaridis, Stefanos [1]
Venieris, Stylianos I. [1]
Kim, Hyeji [1]
Lane, Nicholas D. [1, 2]
Affiliations
[1] Samsung AI Ctr, Cambridge, England
[2] Univ Cambridge, Cambridge, England
Keywords
MULTIOBJECTIVE OPTIMIZATION
DOI
10.1145/3400302.3415698
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Convolutional neural networks (CNNs) have recently become the state-of-the-art in a diversity of AI tasks. Despite their popularity, CNN inference still comes at a high computational cost. A growing body of work aims to alleviate this by exploiting the difference in the classification difficulty among samples and early-exiting at different stages of the network. Nevertheless, existing studies on early exiting have primarily focused on the training scheme, without considering the use-case requirements or the deployment platform. This work presents HAPI, a novel methodology for generating high-performance early-exit networks by co-optimising the placement of intermediate exits together with the early-exit strategy at inference time. Furthermore, we propose an efficient design space exploration algorithm which enables the faster traversal of a large number of alternative architectures and generates the highest-performing design, tailored to the use-case requirements and target hardware. Quantitative evaluation shows that our system consistently outperforms alternative search mechanisms and state-of-the-art early-exit schemes across various latency budgets. Moreover, it pushes further the performance of highly optimised hand-crafted early-exit CNNs, delivering up to 5.11x speedup over lightweight models on imposed latency-driven SLAs for embedded devices.
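The abstract refers to early-exiting at different stages of the network. As a rough illustration of that general idea only, the sketch below applies a generic confidence-threshold exit rule. It is not HAPI's method: the abstract does not specify the exit policy, and HAPI additionally co-optimises where the exits are placed. The names `early_exit_predict`, `toy_stages`, and the `threshold` parameter are hypothetical, and the "network" is a toy stand-in made of plain Python functions.

```python
import math
from typing import Callable, List, Sequence, Tuple

# A stage is a (backbone_block, exit_classifier) pair. Both are hypothetical
# stand-ins for a slice of a CNN and the small classifier attached to it.
Stage = Tuple[Callable[[float], float], Callable[[float], Sequence[float]]]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(sample: float, stages: List[Stage],
                       threshold: float) -> Tuple[int, int]:
    """Return (predicted_label, index_of_exit_taken).

    Assumed policy: stop at the first exit whose top-1 softmax confidence
    reaches `threshold`; the last exit always answers.
    """
    if not stages:
        raise ValueError("stages must not be empty")
    features = sample
    for index, (backbone, classifier) in enumerate(stages):
        features = backbone(features)           # run the next slice of the network
        probs = softmax(classifier(features))   # evaluate the exit attached to it
        confidence = max(probs)
        label = probs.index(confidence)
        if confidence >= threshold or index == len(stages) - 1:
            return label, index
    raise AssertionError("unreachable: the last stage always returns")


# Toy two-stage "network": easy inputs are confident at the first exit,
# harder ones fall through to the final classifier.
toy_stages: List[Stage] = [
    (lambda x: x, lambda f: [f, 0.0]),
    (lambda x: 2.0 * x, lambda f: [0.0, f]),
]

if __name__ == "__main__":
    print(early_exit_predict(5.0, toy_stages, threshold=0.9))
    print(early_exit_predict(0.5, toy_stages, threshold=0.9))
```

In this toy setup, raising `threshold` sends more samples to the later exit, trading latency for confidence; this is the kind of knob an early-exit strategy exposes at inference time.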
Pages: 9