High performance accelerators for deep neural networks: A review

Cited by: 6
Authors
Akhoon, Mohd Saqib [1 ]
Suandi, Shahrel A. [1 ]
Alshahrani, Abdullah [2 ]
Saad, Abdul-Malik H. Y. [3 ]
Albogamy, Fahad R. [4 ]
Bin Abdullah, Mohd Zaid [1 ]
Loan, Sajad A. [5 ]
Affiliations
[1] Univ Sains Malaysia, Sch Elect & Elect Engn, Intelligent Biometr Grp, George Town, Malaysia
[2] Univ Jeddah, Dept Comp Sci & Artificial Intelligence, Coll Comp Sci & Engn, Jeddah, Saudi Arabia
[3] Univ Teknol Malaysia, Div Elect & Comp Engn, Fac Engn, Sch Elect Engn, Johor Baharu, Malaysia
[4] Taif Univ, Turabah Univ Coll, Comp Sci Program, At Taif, Saudi Arabia
[5] Jamia Millia Islamia, Dept Elect & Commun, New Delhi 110025, India
Keywords
artificial intelligence; convolutional neural networks; deep neural network; machine learning; accelerators; CNN
DOI
10.1111/exsy.12831
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
The availability of huge volumes of structured and unstructured data, advanced high-density memory, and high-performance computing machines has provided a strong push for developments in the artificial intelligence (AI) and machine learning (ML) domains. AI and ML have rekindled the hope of efficiently solving complex problems that were out of reach in the recent past. The generation and availability of big data is a strong driving force for the development of AI/ML applications; however, several challenges need to be addressed, such as processing speed, memory requirements, high bandwidth, low-latency memory access, and highly conductive and flexible connections between processing units and memory blocks. Conventional computing platforms are unable to address these issues for machine learning and AI. Deep neural networks (DNNs) are widely employed for ML and AI applications, such as speech recognition, computer vision, robotics, and so forth, efficiently and accurately. However, this accuracy comes at the cost of high computational complexity, sacrificing performance parameters such as energy efficiency and throughput and incurring high latency. To address the problems of latency, energy efficiency, complexity, power consumption, and so forth, many state-of-the-art DNN accelerators have been designed and implemented as application-specific integrated circuits (ASICs) and on field-programmable gate arrays (FPGAs). This work surveys the state of the art in these recently developed DNN accelerators. Various DNN architectures, their computing units, and the emerging technologies used to improve DNN accelerator performance are discussed. Finally, we explore the scope for further improvement in these accelerator designs, along with opportunities and challenges for future research.
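To make the computational complexity mentioned in the abstract concrete, the following minimal Python sketch estimates the multiply-accumulate (MAC) count of a few convolutional layers, the core workload that ASIC/FPGA accelerators target. The layer shapes are hypothetical (loosely VGG-style) and are not taken from the reviewed paper; they serve only to show the order of magnitude of work an accelerator must sustain per inference.

# Back-of-the-envelope cost model for 2D convolutional layers.
# A conv layer with C_in input channels, C_out output channels,
# K x K kernels, and an H x W output feature map performs
# C_out * H * W * C_in * K * K multiply-accumulate (MAC) operations.

def conv_macs(c_in: int, c_out: int, k: int, h_out: int, w_out: int) -> int:
    """MACs for one conv layer (stride/padding already folded into h_out, w_out)."""
    return c_out * h_out * w_out * c_in * k * k

# Hypothetical layer shapes, chosen only for illustration:
layers = [
    # (c_in, c_out, kernel, h_out, w_out)
    (3,   64,  3, 224, 224),
    (64,  128, 3, 112, 112),
    (128, 256, 3, 56,  56),
    (256, 512, 3, 28,  28),
]

total_macs = sum(conv_macs(*shape) for shape in layers)
print(f"Total MACs per inference: {total_macs / 1e9:.2f} GMACs")
# Prints roughly 3.78 GMACs; at ~2 ops per MAC this small network already
# needs several GFLOPs per image, which is why throughput- and
# energy-oriented hardware accelerators matter for DNN inference.

Even this four-layer toy network requires billions of MACs per image; full networks and real-time frame rates multiply that further, motivating the dataflow, memory-hierarchy, and precision optimizations surveyed in the review.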
Pages: 23
Related Papers
50 records in total
  • [1] Polymorphic Accelerators for Deep Neural Networks
    Azizimazreah, Arash
    Chen, Lizhong
    IEEE TRANSACTIONS ON COMPUTERS, 2022, 71 (03) : 534 - 546
  • [2] Review of ASIC accelerators for deep neural network
    Machupalli, Raju
    Hossain, Masum
    Mandal, Mrinal
    MICROPROCESSORS AND MICROSYSTEMS, 2022, 89
  • [3] Deep neural networks accelerators with focus on tensor processors
    Bolhasani, Hamidreza
    Marandinejad, Mohammad
    MICROPROCESSORS AND MICROSYSTEMS, 2024, 105
  • [4] CANN: Curable Approximations for High-Performance Deep Neural Network Accelerators
    Hanif, Muhammad Abdullah
    Khalid, Faiq
    Shafique, Muhammad
    PROCEEDINGS OF THE 2019 56TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2019,
  • [5] Optimization of Analog Accelerators for Deep Neural Networks Inference
    Fasoli, Andrea
    Ambrogio, Stefano
    Narayanan, Pritish
    Tsai, Hsinyu
    Mackin, Charles
    Spoon, Katherine
    Friz, Alexander
    Chen, An
    Burr, Geoffrey W.
    2020 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2020,
  • [6] POSTER: Design Space Exploration for Performance Optimization of Deep Neural Networks on Shared Memory Accelerators
    Venkataramani, Swagath
    Choi, Jungwook
    Srinivasan, Vijayalakshmi
    Gopalakrishnan, Kailash
    Chang, Leland
    2017 26TH INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES (PACT), 2017, : 146 - 147
  • [7] An Overview of Efficient Interconnection Networks for Deep Neural Network Accelerators
    Nabavinejad, Seyed Morteza
    Baharloo, Mohammad
    Chen, Kun-Chih
    Palesi, Maurizio
    Kogel, Tim
    Ebrahimi, Masoumeh
    IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, 2020, 10 (03) : 268 - 282
  • [8] Direct training high-performance deep spiking neural networks: a review of theories and methods
    Zhou, Chenlin
    Zhang, Han
    Yu, Liutao
    Ye, Yumin
    Zhou, Zhaokun
    Huang, Liwei
    Ma, Zhengyu
    Fan, Xiaopeng
    Zhou, Huihui
    Tian, Yonghong
    FRONTIERS IN NEUROSCIENCE, 2024, 18
  • [9] High performance reconfigurable accelerator for deep convolutional neural networks
    Qiao R.
    Chen G.
    Gong G.
    Lu H.
Xi'an Dianzi Keji Daxue Xuebao/Journal of Xidian University, 2019, 46 (03) : 130 - 139
  • [10] Automated optimization for memory-efficient high-performance deep neural network accelerators
    Kim, HyunMi
    Lyuh, Chun-Gi
    Kwon, Youngsu
    ETRI JOURNAL, 2020, 42 (04) : 505 - 517