Deep learning accelerators: a case study with MAESTRO

Cited: 5
Authors
Bolhasani, Hamidreza [1 ]
Jassbi, Somayyeh Jafarali [1 ]
Affiliations
[1] Islamic Azad Univ, Sci & Res Branch, Dept Comp Engn, Tehran, Iran
Keywords
Deep learning; Convolutional neural networks; Deep neural networks; Hardware accelerator; Deep learning accelerator
DOI
10.1186/s40537-020-00377-8
CLC Classification
TP301 [Theory and Methods]
Subject Classification Code
081202
Abstract
In recent years, deep learning has become one of the most important topics in computer science. Deep learning is at the cutting edge of technology, and its applications now appear in many aspects of our lives, such as object detection, speech recognition, and natural language processing. Currently, almost all major sciences and technologies benefit from the advantages of deep learning, such as its high accuracy, speed, and flexibility. Therefore, any effort to improve the performance of related techniques is valuable. Deep learning accelerators are hardware architectures designed and optimized to increase the speed, efficiency, and accuracy of computers running deep learning algorithms. In this paper, after reviewing some background on deep learning, a well-known accelerator architecture named MAERI (Multiply-Accumulate Engine with Reconfigurable Interconnects) is investigated. The performance of a deep learning task is measured and compared under two different dataflow strategies, NLR (No Local Reuse) and NVDLA (NVIDIA Deep Learning Accelerator), using an open-source tool called MAESTRO (Modeling Accelerator Efficiency via Spatio-Temporal Resource Occupancy). The measured performance indicators show that the newer, optimized architecture, NVDLA, achieves higher L1 and L2 computation reuse and a lower total runtime (in cycles) than NLR.
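The abstract's central quantity, computation reuse, can be made concrete with a small back-of-the-envelope model. The Python sketch below is a hypothetical simplification, not MAESTRO's actual cost model: the class, function names, and example layer shape are all illustrative assumptions. It counts weight fetches for a single convolution layer under an NLR-style dataflow (every operand re-fetched per MAC) versus a weight-stationary dataflow of the kind NVDLA uses, and reports the resulting reuse factor (MACs per fetch).

```python
# Hypothetical, simplified illustration -- not the MAESTRO cost model itself.
# It contrasts two dataflows on one conv layer: NLR (no local reuse: every
# operand is re-fetched from the buffer for each MAC) versus a
# weight-stationary style (as in NVDLA), where a PE keeps a filter weight
# in its local register and reuses it across all output positions.

from dataclasses import dataclass

@dataclass
class ConvLayer:
    K: int   # output channels
    C: int   # input channels
    R: int   # filter height
    S: int   # filter width
    Xo: int  # output height
    Yo: int  # output width

def total_macs(l: ConvLayer) -> int:
    # Every output element needs C*R*S multiply-accumulates.
    return l.K * l.C * l.R * l.S * l.Xo * l.Yo

def weight_fetches_nlr(l: ConvLayer) -> int:
    # No local reuse: one weight fetch per MAC.
    return total_macs(l)

def weight_fetches_ws(l: ConvLayer) -> int:
    # Weight-stationary: each of the K*C*R*S weights is fetched once,
    # then reused locally for all Xo*Yo output positions.
    return l.K * l.C * l.R * l.S

if __name__ == "__main__":
    layer = ConvLayer(K=64, C=64, R=3, S=3, Xo=56, Yo=56)  # assumed ResNet-like layer
    macs = total_macs(layer)
    for name, fetches in [("NLR", weight_fetches_nlr(layer)),
                          ("weight-stationary (NVDLA-style)", weight_fetches_ws(layer))]:
        print(f"{name}: {macs} MACs / {fetches} weight fetches "
              f"-> reuse factor {macs / fetches:.0f}")
```

Under this toy model, the weight-stationary dataflow's reuse factor equals the number of output positions (56 × 56 = 3136 per weight) while NLR's is 1, which illustrates the kind of L1/L2 reuse gap that MAESTRO quantifies in the paper's NLR-versus-NVDLA comparison.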
Pages: 11
Related Papers
50 records in total
  • [31] Test and Yield Loss Reduction of AI and Deep Learning Accelerators
    Sadi, Mehdi
    Guin, Ujjwal
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2022, 41 (01) : 104 - 115
  • [32] SEALing Neural Network Models in Encrypted Deep Learning Accelerators
    Zuo, Pengfei
    Hua, Yu
    Liang, Ling
    Xie, Xinfeng
    Hu, Xing
    Xie, Yuan
    2021 58TH ACM/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2021, : 1255 - 1260
  • [33] Neutrons Sensitivity of Deep Reinforcement Learning Policies on EdgeAI Accelerators
    Bodmann, Pablo R.
    Saveriano, Matteo
    Kritikakou, Angeliki
    Rech, Paolo
    IEEE TRANSACTIONS ON NUCLEAR SCIENCE, 2024, 71 (08) : 1480 - 1486
  • [34] A Survey and Taxonomy of FPGA-based Deep Learning Accelerators
    Blaiech, Ahmed Ghazi
    Ben Khalifa, Khaled
    Valderrama, Carlos
    Fernandes, Marcelo A. C.
    Bedoui, Mohamed Hedi
    JOURNAL OF SYSTEMS ARCHITECTURE, 2019, 98 : 331 - 345
  • [35] Analyzing and Mitigating Circuit Aging Effects in Deep Learning Accelerators
    Das, Sanjay
    Kundu, Shamik
    Menon, Anand
    Ren, Yihui
    Kharel, Shubha
    Basu, Kanad
2024 IEEE 42ND VLSI TEST SYMPOSIUM (VTS 2024), 2024
  • [36] A Comprehensive Evaluation of Novel AI Accelerators for Deep Learning Workloads
    Emani, Murali
    Xie, Zhen
    Raskar, Siddhisanket
    Sastry, Varuni
    Arnold, William
    Wilson, Bruce
    Thakur, Rajeev
    Vishwanath, Venkatram
    Liu, Zhengchun
    Papka, Michael E.
    Bohorquez, Cindy Orozco
    Weisner, Rick
    Li, Karen
    Sheng, Yongning
    Du, Yun
    Zhang, Jian
    Tsyplikhin, Alexander
    Khaira, Gurdaman
    Fowers, Jeremy
    Sivakumar, Ramakrishnan
    Godsoe, Victoria
    Macias, Adrian
    Tekur, Chetan
    Boyd, Matthew
    2022 IEEE/ACM INTERNATIONAL WORKSHOP ON PERFORMANCE MODELING, BENCHMARKING AND SIMULATION OF HIGH PERFORMANCE COMPUTER SYSTEMS (PMBS), 2022, : 13 - 25
  • [37] Deep Learning at Scale on NVIDIA V100 Accelerators
    Xu, Rengan
    Han, Frank
    Ta, Quy
    PROCEEDINGS OF 2018 IEEE/ACM PERFORMANCE MODELING, BENCHMARKING AND SIMULATION OF HIGH PERFORMANCE COMPUTER SYSTEMS (PMBS 2018), 2018, : 23 - 32
  • [38] FIdelity: Efficient Resilience Analysis Framework for Deep Learning Accelerators
    He, Yi
    Balaprakash, Prasanna
    Li, Yanjing
    2020 53RD ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO 2020), 2020, : 270 - 281
  • [39] Co-designed Systems for Deep Learning Hardware Accelerators
    Brooks, David M.
2018 INTERNATIONAL SYMPOSIUM ON VLSI TECHNOLOGY, SYSTEMS AND APPLICATION (VLSI-TSA), 2018
  • [40] Kernel Mapping Techniques for Deep Learning Neural Network Accelerators
    Ozdemir, Sarp
    Khasawneh, Mohammad
    Rao, Smriti
    Madden, Patrick H.
    ISPD'22: PROCEEDINGS OF THE 2022 INTERNATIONAL SYMPOSIUM ON PHYSICAL DESIGN, 2022, : 21 - 28