Scaling for edge inference of deep neural networks

Cited by: 287
Authors
Xu, Xiaowei [1 ]
Ding, Yukun [1 ]
Hu, Sharon Xiaobo [1 ]
Niemier, Michael [1 ]
Cong, Jason [2 ]
Hu, Yu [3 ]
Shi, Yiyu [1 ]
Affiliations
[1] Univ Notre Dame, Dept Comp Sci, Notre Dame, IN 46556 USA
[2] Univ Calif Los Angeles, Dept Comp Sci, Los Angeles, CA 90024 USA
[3] Huazhong Univ Sci & Technol, Sch Opt & Elect Informat, Wuhan, Hubei, Peoples R China
Source
NATURE ELECTRONICS, 2018, Vol. 1, No. 4
Keywords
ENERGY;
DOI
10.1038/s41928-018-0059-3
Chinese Library Classification
TM [Electrical technology]; TN [Electronic technology, communication technology];
Subject Classification Code
0808; 0809;
Abstract
Deep neural networks offer considerable potential across a range of applications, from advanced manufacturing to autonomous cars. A clear trend in deep neural networks is the exponential growth of network size and the associated increases in computational complexity and memory consumption. However, the performance and energy efficiency of edge inference, in which inference (the application of a trained network to new data) is performed locally on embedded platforms with limited area and power budgets, are bounded by technology scaling. Here we analyse recent data and show that there are widening gaps between the computational complexity and energy efficiency required by data scientists and the hardware capacity made available by hardware architects. We then discuss various architecture and algorithm innovations that could help to bridge these gaps.
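To make the complexity and memory figures the abstract refers to concrete, a rough per-layer cost model can be sketched as follows. This is an illustrative sketch only, not taken from the paper; the layer shapes below are hypothetical, VGG-style values.

```python
# Rough cost model for a 2D convolutional layer: multiply-accumulate
# operations (MACs) and parameter memory. Illustrative only; shapes are
# hypothetical, not from the paper.

def conv2d_cost(c_in, c_out, k, h_out, w_out, bytes_per_weight=4):
    """Return (MACs, parameter bytes) for one k x k conv layer."""
    macs = c_in * c_out * k * k * h_out * w_out  # one MAC per weight per output pixel
    params = c_in * c_out * k * k                # ignoring bias terms
    return macs, params * bytes_per_weight

# A VGG-style 3x3 layer at 224x224 output resolution:
macs, mem = conv2d_cost(c_in=64, c_out=64, k=3, h_out=224, w_out=224)
print(f"{macs / 1e9:.2f} GMACs, {mem / 1e6:.2f} MB of weights")
# → 1.85 GMACs, 0.15 MB of weights
```

A single such layer already costs nearly two billion MACs per frame, which is why the abstract's gap between required complexity and available edge hardware capacity matters; note also that lowering `bytes_per_weight` (e.g. 8-bit quantization) shrinks weight memory proportionally, one of the algorithm-level remedies the paper discusses.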
Pages: 216-222
Page count: 7
Related Papers
50 in total (items 31-40 shown)
  • [31] Many Models at the Edge: Scaling Deep Inference via Model-Level Caching
    Ogden, Samuel S.
    Gilman, Guin R.
    Walls, Robert J.
    Guo, Tian
    2021 IEEE INTERNATIONAL CONFERENCE ON AUTONOMIC COMPUTING AND SELF-ORGANIZING SYSTEMS (ACSOS 2021), 2021, : 51 - 60
  • [32] Measuring the Uncertainty of Predictions in Deep Neural Networks with Variational Inference
    Steinbrener, Jan
    Posch, Konstantin
    Pilz, Juergen
    SENSORS, 2020, 20 (21) : 1 - 22
  • [33] Partitioning Sparse Deep Neural Networks for Scalable Training and Inference
    Demirci, Gunduz Vehbi
    Ferhatosmanoglu, Hakan
    PROCEEDINGS OF THE 2021 ACM INTERNATIONAL CONFERENCE ON SUPERCOMPUTING, ICS 2021, 2021, : 254 - 265
  • [34] Automatic Generation of Dynamic Inference Architecture for Deep Neural Networks
    Zhao, Shize
    He, Liulu
    Xie, Xiaoru
    Lin, Jun
    Wang, Zhongfeng
    2021 IEEE WORKSHOP ON SIGNAL PROCESSING SYSTEMS (SIPS 2021), 2021, : 117 - 122
  • [35] Photorealistic Facial Texture Inference Using Deep Neural Networks
    Saito, Shunsuke
    Wei, Lingyu
    Hu, Liwen
    Nagano, Koki
    Li, Hao
    30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 2326 - 2335
  • [36] Mesoscopic Facial Geometry Inference Using Deep Neural Networks
    Huynh, Loc
    Chen, Weikai
    Saito, Shunsuke
    Xing, Jun
    Nagano, Koki
    Jones, Andrew
    Debevec, Paul
    Li, Hao
    2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 8407 - 8416
  • [37] Fast inference of deep neural networks in FPGAs for particle physics
    Duarte, J.
    Han, S.
    Harris, P.
    Jindariani, S.
    Kreinar, E.
    Kreis, B.
    Ngadiuba, J.
    Pierini, M.
    Rivera, R.
    Tran, N.
    Wu, Z.
    JOURNAL OF INSTRUMENTATION, 2018, 13
  • [38] Redundant feature pruning for accelerated inference in deep neural networks
    Ayinde, Babajide O.
    Inanc, Tamer
    Zurada, Jacek M.
    NEURAL NETWORKS, 2019, 118 : 148 - 158
  • [39] ACCURATE AND EFFICIENT FIXED POINT INFERENCE FOR DEEP NEURAL NETWORKS
    Rajagopal, Vasanthakumar
    Ramasamy, Chandra Kumar
    Vishnoi, Ashok
    Gadde, Raj Narayana
    Miniskar, Narasinga Rao
    Pasupuleti, Sirish Kumar
    2018 25TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2018, : 1847 - 1851
  • [40] Neural Networks Meet Physical Networks: Distributed Inference Between Edge Devices and the Cloud
    Chinchali, Sandeep P.
    Cidon, Eyal
    Pergament, Evgenya
    Chu, Tianshu
    Katti, Sachin
    HOTNETS-XVII: PROCEEDINGS OF THE 2018 ACM WORKSHOP ON HOT TOPICS IN NETWORKS, 2018, : 50 - 56