Accelerating DNN Inference by Edge-Cloud Collaboration

Cited by: 2
Authors
Chen, Jianan [1]
Qi, Qi [1]
Wang, Jingyu [1]
Sun, Haifeng [1]
Liao, Jianxin [1]
Affiliations
[1] Beijing University of Posts and Telecommunications, State Key Laboratory of Networking and Switching Technology, Beijing, People's Republic of China
Funding
National Key R&D Program of China; China Postdoctoral Science Foundation; National Natural Science Foundation of China;
Keywords
DNN inference; edge devices; dynamic partition; intelligent application;
DOI
10.1109/IPCCC51483.2021.9679434
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Deep neural networks (DNNs) have become indispensable tools for intelligent applications, and the demand for deploying DNNs on edge devices is increasing dramatically. Deployment is challenging because DNN inference is computation-intensive, while edge devices are resource-constrained. Prior solutions addressed this challenge through collaboration between the cloud and edge devices, but they do not take the inference request rate into account, even though inference delay increases dramatically as the request rate rises. In this paper, we propose a scheme that dynamically partitions a DNN into two or three parts and distributes them across the edge and the cloud, achieving the lowest delay as the request rate changes. The scheme selects the optimal partition points of the DNN with a layer evaluation model (LEM) and a total delay prediction model (DPM) under different request rates. Experiments that deploy the AlexNet, VGG, NiN and ResNet models in a distributed manner on the ImageNet image classification dataset show that the proposed scheme significantly reduces the total end-to-end latency by fully using both edge and cloud resources. Compared with the state-of-the-art partition approach, it reduces inference delay by a factor of 1.3 to 1.6 and improves throughput by a factor of 1.2 to 1.7.
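The idea of choosing a partition point that depends on the request rate can be illustrated with a minimal sketch. It is not the paper's LEM or DPM: the abstract does not describe how those models work, so the sketch assumes per-layer latencies are already known and substitutes a simple M/M/1 queueing formula (mean time in system = s / (1 - λs)) for delay prediction. It also handles only a two-part split of a chain-structured network. All names (`sojourn`, `best_partition`) and all numbers are hypothetical.

```python
import math
from typing import List, Tuple


def sojourn(service_ms: float, rate_per_ms: float) -> float:
    """Mean time in system of an M/M/1 stage (an assumed stand-in for delay prediction)."""
    if service_ms == 0.0:
        return 0.0  # stage is unused
    utilisation = rate_per_ms * service_ms
    if utilisation >= 1.0:
        return math.inf  # stage cannot keep up with the request rate
    return service_ms / (1.0 - utilisation)


def best_partition(
    edge_ms: List[float],      # assumed per-layer latency on the edge device
    cloud_ms: List[float],     # assumed per-layer latency in the cloud
    out_bytes: List[int],      # output size of each layer
    input_bytes: int,          # size of the raw input
    bandwidth: float,          # uplink bandwidth in bytes per millisecond
    rate_per_ms: float,        # inference request rate
) -> Tuple[int, float]:
    """Return (k, delay): layers [0, k) run on the edge, layers [k, n) in the cloud."""
    n = len(edge_ms)
    best_k, best_delay = 0, math.inf
    for k in range(n + 1):
        edge_service = sum(edge_ms[:k])
        cloud_service = sum(cloud_ms[k:])
        if k == n:
            transfer = 0.0  # everything runs on the edge, nothing is uploaded
        else:
            payload = input_bytes if k == 0 else out_bytes[k - 1]
            transfer = payload / bandwidth
        delay = (
            sojourn(edge_service, rate_per_ms)
            + sojourn(transfer, rate_per_ms)
            + sojourn(cloud_service, rate_per_ms)
        )
        if delay < best_delay:
            best_k, best_delay = k, delay
    return best_k, best_delay


# Hypothetical three-layer network.
edge = [2.0, 3.0, 10.0]
cloud = [1.0, 1.0, 2.0]
sizes = [2000, 500, 10]

low = best_partition(edge, cloud, sizes, 600, 100.0, 0.0)
high = best_partition(edge, cloud, sizes, 600, 100.0, 0.12)
print(low, high)
```

With these made-up numbers, the cheapest split at a negligible request rate sends the raw input straight to the cloud, whereas at the higher rate the uplink becomes the bottleneck and the best split moves to after the second layer, where the intermediate output is smaller.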
Pages: 7