Deep Learning Accelerators' Configuration Space Exploration Effect on Performance and Resource Utilization: A Gemmini Case Study

Cited by: 3
Authors
Gookyi, Dennis Agyemanh Nana [1]
Lee, Eunchong [2]
Kim, Kyungho [2]
Jang, Sung-Joon [2]
Lee, Sang-Seol [2]
Affiliations
[1] CSIR, Inst Sci & Technol Informat, Elect Div, Accra, Ghana
[2] Korea Elect Technol Inst, Intelligent Image Proc Res Ctr, Seongnam Si 13488, South Korea
Keywords
deep learning; hardware accelerators; open-source; Gemmini; systolic array; GEMM; output/weight stationary dataflow; FPGA; image-to-column;
DOI
10.3390/s23052380
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry];
Discipline classification code
070302; 081704;
Abstract
Though custom deep learning (DL) hardware accelerators are attractive for making inferences in edge computing devices, their design and implementation remain a challenge. Open-source frameworks exist for exploring DL hardware accelerators. Gemmini is an open-source systolic array generator for agile DL accelerator exploration. This paper details the hardware/software components generated using Gemmini. The general matrix-to-matrix multiplication (GEMM) of different dataflow options, including output/weight stationary (OS/WS), was explored in Gemmini to estimate the performance relative to a CPU implementation. The Gemmini hardware was implemented on an FPGA device to explore the effect of several accelerator parameters, including array size, memory capacity, and the CPU/hardware image-to-column (im2col) module, on metrics such as the area, frequency, and power. This work revealed that regarding the performance, the WS dataflow offered a speedup of 3x relative to the OS dataflow, and the hardware im2col operation offered a speedup of 1.1x relative to the operation on the CPU. For hardware resources, an increase in the array size by a factor of 2 led to an increase in both the area and power by a factor of 3.3, and the im2col module led to an increase in area and power by factors of 1.01 and 1.06, respectively.
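The image-to-column (im2col) step named in the abstract can be illustrated with a minimal pure-Python sketch. It is not Gemmini code and makes several simplifying assumptions: a single-channel 2-D input, stride 1, no padding, and no kernel flip (the cross-correlation convention common in DL). The function names `im2col`, `gemm`, `conv2d_direct`, and `conv2d_im2col` are hypothetical and chosen only for this illustration. The sketch shows how unrolling each input patch into a row turns a convolution into a single GEMM, which is the form of work a systolic array executes.

```python
def im2col(image, kh, kw):
    """Unroll every kh x kw patch of a 2-D image (list of rows) into one row."""
    h, w = len(image), len(image[0])
    rows = []
    for i in range(h - kh + 1):
        for j in range(w - kw + 1):
            rows.append([image[i + di][j + dj]
                         for di in range(kh) for dj in range(kw)])
    return rows


def gemm(a, b):
    """Plain matrix-matrix multiply of a (M x K) by b (K x N)."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][c] for k in range(inner)) for c in range(cols)]
            for row in a]


def conv2d_direct(image, kernel):
    """Reference convolution (stride 1, no padding, no kernel flip)."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(w - kw + 1)]
            for i in range(h - kh + 1)]


def conv2d_im2col(image, kernel):
    """Same convolution expressed as im2col followed by one GEMM."""
    kh, kw = len(kernel), len(kernel[0])
    patches = im2col(image, kh, kw)                # (out_h*out_w) x (kh*kw)
    weights = [[kernel[di][dj]]                    # (kh*kw) x 1 column
               for di in range(kh) for dj in range(kw)]
    flat = [r[0] for r in gemm(patches, weights)]  # one value per output pixel
    out_w = len(image[0]) - kw + 1
    return [flat[k:k + out_w] for k in range(0, len(flat), out_w)]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, 1]]
result = conv2d_im2col(image, kernel)
```

For this 3x3 input and 2x2 kernel, both paths produce the same 2x2 output. The patch matrix repeats overlapping input values, which is the memory cost that an im2col unit, whether on the CPU or in hardware, has to manage.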
Pages: 26