xTern: Energy-Efficient Ternary Neural Network Inference on RISC-V-Based Edge Systems

Citations: 0
Authors
Rutishauser, Georg [1]
Mihali, Joan [2]
Scherer, Moritz [1]
Benini, Luca [1,2]
Affiliations
[1] Swiss Fed Inst Technol, Dept Informat Technol & Elektrotech, Zurich, Switzerland
[2] Univ Bologna, Dipartimento Ingn Energia Elettr & Informaz, Bologna, Italy
Keywords
MULTIPLICATION; ACCELERATOR
DOI
10.1109/ASAP61560.2024.00049
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Ternary neural networks (TNNs) offer a superior accuracy-energy trade-off compared to binary neural networks. However, until now, they have required specialized accelerators to realize their efficiency potential, which has hindered widespread adoption. To address this, we present xTern, a lightweight extension of the RISC-V instruction set architecture (ISA) targeted at accelerating TNN inference on general-purpose cores. To complement the ISA extension, we developed a set of optimized kernels leveraging xTern, achieving 67% higher throughput than their 2-bit equivalents. Power consumption increases only marginally, by 5.2%, resulting in an energy efficiency improvement of 57.1%. We demonstrate that the proposed xTern extension, integrated into an octa-core compute cluster, incurs a minimal silicon area overhead of 0.9% with no impact on timing. In end-to-end benchmarks, we demonstrate that xTern enables the deployment of TNNs achieving up to 1.6 percentage points higher CIFAR-10 classification accuracy than 2-bit networks at equal inference latency. Our results show that xTern enables RISC-V-based ultra-low-power edge AI platforms to benefit from the efficiency potential of TNNs.
Pages: 206-213
Page count: 8