Quark: An Integer RISC-V Vector Processor for Sub-Byte Quantized DNN Inference

Cited by: 3
Authors
AskariHemmat, MohammadHossein [1]
Dupuis, Theo [1]
Fournier, Yoan [1]
El Zarif, Nizar [1]
Cavalcante, Matheus [2]
Perotti, Matteo [2]
Gurkaynak, Frank [2]
Benini, Luca [2]
Leduc-Primeau, Francois [1]
Savaria, Yvon [1]
David, Jean-Pierre [1]
Affiliations
[1] Ecole Polytech Montreal, Dept Elect Engn, Montreal, PQ, Canada
[2] Swiss Fed Inst Technol, Integrated Syst Lab, Zurich, Switzerland
Keywords
RISC-V; Vector ISA; Quantization; Machine Learning; Efficiency; Energy
DOI
10.1109/ISCAS46773.2023.10181985
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we present Quark, an integer RISC-V vector processor specifically tailored for sub-byte DNN inference. Quark is implemented in GlobalFoundries' 22FDX FD-SOI technology. It is designed on top of Ara, an open-source 64-bit RISC-V vector processor. To accommodate sub-byte DNN inference, Quark extends Ara by adding specialized vector instructions to perform sub-byte quantized operations. We also remove the floating-point unit from Quark's lanes and use the CVA6 RISC-V scalar core for the re-scaling operations that are required in quantized neural network inference. This makes each lane of Quark 2 times smaller and 1.9 times more power efficient than those of Ara. In this paper, we show that Quark can run quantized models at sub-byte precision. Notably, we show that for 1-bit and 2-bit quantized models, Quark can accelerate the computation of Conv2d over various ranges of inputs and kernel sizes.
Pages: 5
Related papers
42 in total
  • [21] Parallel DNN Inference Framework Leveraging a Compact RISC-V ISA-based Multi-core System
    Zhang, Yipeng
    Du, Bo
    Zhang, Lefei
    Wu, Jia
    KDD '20: PROCEEDINGS OF THE 26TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2020, : 627 - 635
  • [22] XpulpNN: Enabling Energy Efficient and Flexible Inference of Quantized Neural Networks on RISC-V Based IoT End Nodes
    Garofalo, Angelo
    Tagliavini, Giuseppe
    Conti, Francesco
    Benini, Luca
    Rossi, Davide
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, 2021, 9 (03) : 1489 - 1505
  • [23] XpulpNN: Enabling Energy Efficient and Flexible Inference of Quantized Neural Networks on RISC-V based IoT End Nodes
    Garofalo, Angelo
    Tagliavini, Giuseppe
    Conti, Francesco
    Benini, Luca
    Rossi, Davide
    2021 IEEE 28TH SYMPOSIUM ON COMPUTER ARITHMETIC (ARITH 2021), 2021, : 53 - 53
  • [24] An Eight-Core 1.44-GHz RISC-V Vector Processor in 16-nm FinFET
    Schmidt, Colin
    Wright, John
    Wang, Zhongkai
    Chang, Eric
    Ou, Albert
    Bae, Woorham
    Huang, Sean
    Milovanovic, Vladimir
    Flynn, Anita
    Richards, Brian
    Asanovic, Krste
    Alon, Elad
    Nikolic, Borivoje
    IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2022, 57 (01) : 140 - 152
  • [25] Task Mapping and Scheduling on RISC-V MIMD Processor With Vector Accelerator Using Model-Based Parallelization
    Wu, Shanwen
    Kumano, Satoshi
    Marume, Kei
    Edahiro, Masato
    IEEE ACCESS, 2024, 12 : 35779 - 35795
  • [26] VPQC: A Domain-Specific Vector Processor for Post-Quantum Cryptography Based on RISC-V Architecture
    Xin, Guozhu
    Han, Jun
    Yin, Tianyu
    Zhou, Yuchao
    Yang, Jianwei
    Cheng, Xu
    Zeng, Xiaoyang
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2020, 67 (08) : 2672 - 2684
  • [27] AI-PiM-Extending the RISC-V processor with Processing-in-Memory functional units for AI inference at the edge of IoT
    Verma, Vaibhav
    Stan, Mircea R.
    FRONTIERS IN ELECTRONICS, 2022, 3
  • [28] Mix-GEMM: Extending RISC-V CPUs for Energy-Efficient Mixed-Precision DNN Inference Using Binary Segmentation
    Fornt, Jordi
    Reggiani, Enrico
    Fontova-Musté, Pau
    Rodas, Narcís
    Pappalardo, Alessandro
    Sabri Unsal, Osman
    Kestelman, Adrián Cristal
    Altet, Josep
    Moll, Francesc
    Abella, Jaume
    IEEE Transactions on Computers, 2025, 74 (02) : 582 - 596
  • [29] A 3 TOPS/W RISC-V Parallel Cluster for Inference of Fine-Grain Mixed-Precision Quantized Neural Networks
    Nadalini, Alessandro
    Rutishauser, Georg
    Burrello, Alessio
    Bruschi, Nazareno
    Garofalo, Angelo
    Benini, Luca
    Conti, Francesco
    Rossi, Davide
    2023 IEEE COMPUTER SOCIETY ANNUAL SYMPOSIUM ON VLSI, ISVLSI, 2023, : 145 - 150
  • [30] muRISCV-NN: Challenging Zve32x Autovectorization with TinyML Inference Library for RISC-V Vector Extension
    van Kempen, Philipp
    Jones, Jefferson Parker
    Mueller-Gritschneder, Daniel
    Schlichtmann, Ulf
    PROCEEDINGS OF THE 21ST ACM INTERNATIONAL CONFERENCE ON COMPUTING FRONTIERS 2024-WORKSHOPS AND SPECIAL SESSIONS, CF 2024 COMPANION, 2024, : 75 - 78