Context-Aware Deep Model Compression for Edge Cloud Computing

Cited by: 13
Authors
Wang, Lingdong [1]
Xiang, Liyao [1]
Xu, Jiayu [1]
Chen, Jiaju [1]
Zhao, Xing [1]
Yao, Dixi [1]
Wang, Xinbing [1]
Li, Baochun [2]
Affiliations
[1] Shanghai Jiao Tong Univ, John Hopcroft Ctr Comp Sci, Shanghai, Peoples R China
[2] Univ Toronto, Dept Elect & Comp Engn, Toronto, ON, Canada
Source
2020 IEEE 40TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS) | 2020
Funding
National Natural Science Foundation of China
Keywords
Edge Cloud Computing; Neural Architecture Search; Reinforcement Learning
DOI
10.1109/ICDCS47774.2020.00101
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
While deep neural networks (DNNs) have led to a paradigm shift, their exorbitant computational requirements have always been a roadblock to deployment on edge devices such as wearables and smartphones. A hybrid edge-cloud computational framework has therefore been proposed to transfer part of the computation to the cloud by naively partitioning the DNN operations under the assumption of constant network conditions. However, real-world network state varies greatly with context, and DNN partitioning offers only a limited strategy space. In this paper, we explore the structural flexibility of DNNs to fit the edge model to varying network contexts and different deployment platforms. Specifically, we design a reinforcement learning-based decision engine that searches for model transformation strategies in response to a combined objective of model accuracy and computation latency. The engine generates a context-aware model tree so that the DNN can decide which model branch to switch to at runtime. In emulation and field experiments, our approach achieves a 30%-50% latency reduction while retaining model accuracy.
Pages: 787-797
Number of pages: 11