Scaling for edge inference of deep neural networks

Cited by: 287
Authors
Xu, Xiaowei [1]
Ding, Yukun [1]
Hu, Sharon Xiaobo [1]
Niemier, Michael [1]
Cong, Jason [2]
Hu, Yu [3]
Shi, Yiyu [1]
Affiliations
[1] Univ Notre Dame, Dept Comp Sci, Notre Dame, IN 46556 USA
[2] Univ Calif Los Angeles, Dept Comp Sci, Los Angeles, CA 90024 USA
[3] Huazhong Univ Sci & Technol, Sch Opt & Elect Informat, Wuhan, Hubei, Peoples R China
Source
NATURE ELECTRONICS | 2018 / Vol. 1 / Issue 04
Keywords
ENERGY;
DOI
10.1038/s41928-018-0059-3
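Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification code
0808; 0809;
Abstract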
Deep neural networks offer considerable potential across a range of applications, from advanced manufacturing to autonomous cars. A clear trend in deep neural networks is the exponential growth of network size and the associated increases in computational complexity and memory consumption. However, the performance and energy efficiency of edge inference, in which the inference (the application of a trained network to new data) is performed locally on embedded platforms that have limited area and power budget, is bounded by technology scaling. Here we analyse recent data and show that there are increasing gaps between the computational complexity and energy efficiency required by data scientists and the hardware capacity made available by hardware architects. We then discuss various architecture and algorithm innovations that could help to bridge the gaps.
Pages: 216 / 222
Number of pages: 7