Context-Aware Deep Model Compression for Edge Cloud Computing

Cited by: 13
Authors
Wang, Lingdong [1]
Xiang, Liyao [1]
Xu, Jiayu [1]
Chen, Jiaju [1]
Zhao, Xing [1]
Yao, Dixi [1]
Wang, Xinbing [1]
Li, Baochun [2]
Affiliations
[1] Shanghai Jiao Tong Univ, John Hopcroft Ctr Comp Sci, Shanghai, Peoples R China
[2] Univ Toronto, Dept Elect & Comp Engn, Toronto, ON, Canada
Source
2020 IEEE 40TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS) | 2020
Funding
National Natural Science Foundation of China;
Keywords
Edge Cloud Computing; Neural Architecture Search; Reinforcement Learning;
DOI
10.1109/ICDCS47774.2020.00101
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
While deep neural networks (DNNs) have led to a paradigm shift, their exorbitant computational requirements have always been a roadblock to deployment at the edge, such as on wearable devices and smartphones. Hence, hybrid edge-cloud computational frameworks have been proposed to transfer part of the computation to the cloud by naively partitioning the DNN operations under the assumption of constant network conditions. However, real-world network state varies greatly with context, and DNN partitioning offers only a limited strategy space. In this paper, we explore the structural flexibility of DNNs to fit the edge model to varying network contexts and different deployment platforms. Specifically, we design a reinforcement learning-based decision engine that searches for model transformation strategies under a combined objective of model accuracy and computation latency. The engine generates a context-aware model tree so that the DNN can decide which model branch to switch to at runtime. In emulation and field experiments, our approach achieves a 30% to 50% latency reduction while retaining model accuracy.
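The idea of switching model branches according to network context can be illustrated with a minimal sketch. This is not the paper's reinforcement learning engine: it is a greedy selection over a fixed, hand-written set of branches, and every name and number in it (`Branch`, `latency_ms`, `reward`, `pick_branch`, the accuracy and timing figures, and the `weight` trade-off) is a hypothetical assumption chosen only to show how a combined accuracy-latency objective might favor different branches at different bandwidths.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Branch:
    """One hypothetical branch of a model tree (all figures are made up)."""
    name: str
    accuracy: float   # assumed validation accuracy in [0, 1]
    edge_ms: float    # assumed on-device compute time, milliseconds
    upload_kb: float  # assumed size of data sent to the cloud, kilobytes
    cloud_ms: float   # assumed cloud compute time, milliseconds


def latency_ms(branch: Branch, bandwidth_kbps: float) -> float:
    """End-to-end latency: edge compute + upload + cloud compute."""
    # kilobytes -> kilobits, divided by kilobits/second, then seconds -> ms
    transfer_ms = branch.upload_kb * 8.0 / bandwidth_kbps * 1000.0
    return branch.edge_ms + transfer_ms + branch.cloud_ms


def reward(branch: Branch, bandwidth_kbps: float, weight: float = 0.001) -> float:
    """Assumed combined objective: accuracy minus a weighted latency penalty."""
    return branch.accuracy - weight * latency_ms(branch, bandwidth_kbps)


def pick_branch(branches, bandwidth_kbps: float, weight: float = 0.001) -> Branch:
    """Greedily pick the branch with the highest reward for this bandwidth."""
    return max(branches, key=lambda b: reward(b, bandwidth_kbps, weight))


# Two hypothetical branches: one offloads early, one runs fully on the edge.
BRANCHES = [
    Branch("offload_early", accuracy=0.92, edge_ms=5.0, upload_kb=200.0, cloud_ms=10.0),
    Branch("compressed_edge", accuracy=0.89, edge_ms=40.0, upload_kb=0.0, cloud_ms=0.0),
]

if __name__ == "__main__":
    for kbps in (50000.0, 1000.0):
        print(kbps, pick_branch(BRANCHES, kbps).name)
```

With these assumed figures, the offloading branch wins on a fast link and the compressed on-device branch wins on a slow one, which is the kind of context-dependent switch the abstract describes.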
Pages: 787 - 797
Page count: 11
Related papers
50 in total
  • [31] The ethos of context-aware computing
    Pintilie, Sorin
    Interactions, 2015, 22 (04) : 20 - 21
  • [32] Context-aware computing with sound
    Madhavapeddy, A
    Scott, D
    Sharp, R
    UBICOMP 2003: UBIQUITOUS COMPUTING, 2003, 2864 : 315 - 332
  • [33] A review of context-aware computing
    Lee, Yun-mi
    Hong, Jong-yi
    Oh, Won-il
    Kang, Hyeon
    Suh, Eui-ho
    PROCEEDINGS OF THE 11TH ANNUAL CONFERENCE OF ASIA PACIFIC DECISION SCIENCES INSTITUTE: INNOVATION & SERVICE EXCELLENCE FOR COMPETITIVE ADVANTAGE IN THE GLOBAL ENVIRONMENT, 2006, : 315 - +
  • [34] An efficient context-aware coordination model for ubiquitous computing
    Sudha, R.
    Rajagopalan, M. R.
    Sridevi, S.
    Selvi, S. Thamarai
    2007 FOURTH ANNUAL INTERNATIONAL CONFERENCE ON MOBILE AND UBIQUITOUS SYSTEMS: NETWORKING & SERVICES, 2007, : 315 - +
  • [35] A Context-Aware Museum-Guide System Based on Cloud Computing
    Vahdat-Nejad, Hamed
    Navabi, Mohammad Sadeq
    Khosravi-Mahmouei, Hosein
    INTERNATIONAL JOURNAL OF CLOUD APPLICATIONS AND COMPUTING, 2018, 8 (04) : 1 - 19
  • [36] A Context-Aware Architecture Supporting Service Availability in Mobile Cloud Computing
    Guerrero-Contreras, Gabriel
    Luis Garrido, Jose
    Balderas-Diaz, Sara
    Rodriguez-Dominguez, Carlos
    IEEE TRANSACTIONS ON SERVICES COMPUTING, 2017, 10 (06) : 956 - 968
  • [37] Ontology-based context-aware SLA management for cloud computing
    Labidi, Taher
    Mtibaa, Achraf
    Gargouri, Faiez
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2014, 8748 : 193 - 208
  • [38] Ontology-Based Context-Aware SLA Management for Cloud Computing
    Labidi, Taher
    Mtibaa, Achraf
    Gargouri, Faiez
    MODEL AND DATA ENGINEERING, MEDI 2014, 2014, 8748 : 193 - 208
  • [39] Context-Aware Compilation of DNN Training Pipelines across Edge and Cloud
    Yao, Dixi
    Xiang, Liyao
    Wang, Zifan
    Xu, Jiayu
    Li, Chao
    Wang, Xinbing
    PROCEEDINGS OF THE ACM ON INTERACTIVE MOBILE WEARABLE AND UBIQUITOUS TECHNOLOGIES-IMWUT, 2021, 5 (04):
  • [40] A Context-Aware Edge-Cloud Collaboration Framework for QoS Prediction
    Cheng, Yong
    Cao, Weihao
    Fang, Hao
    Zang, Shaobo
    TSINGHUA SCIENCE AND TECHNOLOGY, 2025, 30 (03) : 1201 - 1214