Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications

Cited by: 49
Authors
Cai, Han [1]
Lin, Ji [1]
Lin, Yujun [1]
Liu, Zhijian [1]
Tang, Haotian [1]
Wang, Hanrui [1]
Zhu, Ligeng [1]
Han, Song [1]
Affiliation
[1] MIT, 77 Massachusetts Ave, Cambridge, MA 02139 USA
Keywords
Efficient deep learning; TinyML; model compression; AutoML; neural architecture search; neural-network accelerator; architecture; implementation; coprocessor; prediction; model
DOI
10.1145/3486618
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Deep neural networks (DNNs) have achieved unprecedented success in the field of artificial intelligence (AI), including computer vision, natural language processing, and speech recognition. However, their superior performance comes at the considerable cost of computational complexity, which greatly hinders their application on many resource-constrained devices, such as mobile phones and Internet of Things (IoT) devices. Therefore, methods and techniques that can lift the efficiency bottleneck while preserving the high accuracy of DNNs are in great demand to enable numerous edge AI applications. This article provides an overview of efficient deep learning methods, systems, and applications. We start by introducing popular model compression methods, including pruning, factorization, and quantization, as well as compact model design. To reduce the large design cost of these manual solutions, we discuss the AutoML framework for each of them, such as neural architecture search (NAS) and automated pruning and quantization. We then cover efficient on-device training to enable user customization based on the local data on mobile devices. Apart from general acceleration techniques, we also showcase several task-specific accelerations for point cloud, video, and natural language processing that exploit their spatial sparsity and temporal/token redundancy. Finally, to support all these algorithmic advancements, we introduce efficient deep learning system design from both software and hardware perspectives.
Pages: 50