Cutting-Edge Inference: Dynamic DNN Model Partitioning and Resource Scaling for Mobile AI

Cited by: 1
Authors
Lim, Jeong-A [1]
Lee, Joohyun [2]
Kwak, Jeongho [3]
Kim, Yeongjin [1]
Affiliations
[1] Inha Univ, Dept Elect Engn, Incheon 22212, South Korea
[2] Hanyang Univ, Dept Elect & Elect Engn, Ansan 15588, South Korea
[3] Daegu Gyeongbuk Inst Sci & Technol DGIST, Informat & Commun Engn, Daegu 42988, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Mobile handsets; Computational modeling; Servers; Artificial intelligence; Quality of experience; Artificial neural networks; Accuracy; DNN model partitioning; deep learning; mobile edge computing; mobile vision application; quality of experience; ALLOCATION;
DOI
10.1109/TSC.2024.3466848
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Applications that use artificial intelligence (AI) techniques on mobile devices, such as augmented reality, have recently become pervasive. The quality of experience (QoE) of such applications depends on the hardware specifications of mobile devices, dynamic service demands, stochastic network states, and the characteristics of deep neural network (DNN) models. In this paper, we propose CutEdge, which leverages a virtual queue-based Lyapunov optimization framework to jointly optimize DNN model partitioning between a mobile device and a mobile edge computing (MEC) server and the processing/networking resources of the mobile device with respect to internal/external system dynamics. Specifically, CutEdge simultaneously decides (i) the partition point of the DNN model between the mobile device and the MEC server, (ii) the GPU clock frequency, and (iii) the transmission rate of the mobile device. We then theoretically show the optimal trade-off curves among energy consumption, throughput, and end-to-end latency yielded by CutEdge; these QoE metrics have not been jointly addressed in previous studies. Moreover, we show the impact of jointly optimizing the three control parameters on performance via real trace-driven simulations. Finally, we show the superiority of CutEdge over existing algorithms through experiments on a testbed implemented with an embedded AI device and an MEC server.
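The joint decision described in the abstract can be illustrated with a generic drift-plus-penalty rule from Lyapunov optimization. The sketch below is not the CutEdge algorithm from the paper: the four-layer DNN profile, the energy and latency models, all constants, and the names `frame_cost`, `choose`, and `simulate` are hypothetical and chosen only for illustration. In each time slot it picks the partition point, GPU frequency, and transmission rate that minimize `V * energy - queue * served_frames`, where `V` weights energy saving against queue backlog.

```python
from itertools import product

# Hypothetical profile of a toy 4-layer DNN (all numbers are made up).
LAYER_GFLOP = [0.4, 0.8, 0.6, 0.2]          # compute cost of each layer
UPLOAD_MBIT = [4.8, 3.2, 1.6, 0.8, 0.1]     # data sent if the first k layers run locally
GPU_FREQS_GHZ = [0.3, 0.6, 0.9]             # candidate GPU clock frequencies
TX_RATES_MBPS = [5.0, 10.0, 20.0]           # candidate transmission rates

GFLOPS_PER_GHZ = 10.0     # assumed GPU throughput per GHz
GPU_POWER_COEFF = 2.0     # assumed: GPU power = coeff * freq^3 (watts)
RADIO_BASE_W = 0.2        # assumed radio base power (watts)
RADIO_W_PER_MBPS = 0.05   # assumed extra radio power per Mbps
SLOT_S = 1.0              # length of one decision slot (seconds)


def frame_cost(cut, freq, rate):
    """Return (energy in joules, latency in seconds) for one frame."""
    compute_s = sum(LAYER_GFLOP[:cut]) / (GFLOPS_PER_GHZ * freq)
    compute_j = GPU_POWER_COEFF * freq ** 3 * compute_s
    tx_s = UPLOAD_MBIT[cut] / rate
    tx_j = (RADIO_BASE_W + RADIO_W_PER_MBPS * rate) * tx_s
    return compute_j + tx_j, compute_s + tx_s


def choose(queue, v):
    """Pick (cut, freq, rate) minimizing the drift-plus-penalty score.

    Returns (cut, freq, rate, frames served in the slot, energy in the slot).
    """
    best = None
    cuts = range(len(LAYER_GFLOP) + 1)
    for cut, freq, rate in product(cuts, GPU_FREQS_GHZ, TX_RATES_MBPS):
        energy, latency = frame_cost(cut, freq, rate)
        served = SLOT_S / latency            # frames processed in one slot
        slot_energy = energy * served
        score = v * slot_energy - queue * served
        if best is None or score < best[0]:
            best = (score, cut, freq, rate, served, slot_energy)
    return best[1:]


def simulate(arrivals, v):
    """Run the rule over per-slot frame arrivals; return (final queue, total energy)."""
    queue, total_energy = 0.0, 0.0
    for arrived in arrivals:
        _, _, _, served, slot_energy = choose(queue, v)
        queue = max(queue - served, 0.0) + arrived
        total_energy += slot_energy
    return queue, total_energy


if __name__ == "__main__":
    for v in (0.1, 1.0, 10.0):
        print(v, simulate([3, 5, 2, 6, 4] * 20, v))
```

Under these assumptions, a larger `V` favors low-power configurations at the cost of a longer queue, while a larger backlog pushes the rule toward configurations that serve more frames per slot. This is the kind of energy/throughput/latency trade-off the abstract refers to, shown here only in toy form.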
Pages: 3300-3316
Number of pages: 17