Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications

Cited by: 49
Authors
Cai, Han [1]
Lin, Ji [1]
Lin, Yujun [1]
Liu, Zhijian [1]
Tang, Haotian [1]
Wang, Hanrui [1]
Zhu, Ligeng [1]
Han, Song [1]
Affiliations
[1] MIT, 77 Massachusetts Ave, Cambridge, MA 02139 USA
Keywords
Efficient deep learning; TinyML; model compression; AutoML; neural architecture search; NEURAL-NETWORK ACCELERATOR; ARCHITECTURE; IMPLEMENTATION; COPROCESSOR; PREDICTION; MODEL
DOI
10.1145/3486618
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Deep neural networks (DNNs) have achieved unprecedented success in the field of artificial intelligence (AI), including computer vision, natural language processing, and speech recognition. However, their superior performance comes at a considerable cost in computational complexity, which greatly hinders their application on many resource-constrained devices, such as mobile phones and Internet of Things (IoT) devices. Therefore, methods and techniques that can lift the efficiency bottleneck while preserving the high accuracy of DNNs are in great demand to enable numerous edge AI applications. This article provides an overview of efficient deep learning methods, systems, and applications. We start by introducing popular model compression methods, including pruning, factorization, and quantization, as well as compact model design. To reduce the large design cost of these manual solutions, we discuss the AutoML framework for each of them, such as neural architecture search (NAS) and automated pruning and quantization. We then cover efficient on-device training to enable user customization based on the local data on mobile devices. Apart from general acceleration techniques, we also showcase several task-specific accelerations for point cloud, video, and natural language processing by exploiting their spatial sparsity and temporal/token redundancy. Finally, to support all these algorithmic advancements, we introduce efficient deep learning system design from both software and hardware perspectives.
Pages: 50
Related papers
50 in total
  • [31] Deployment of Deep Learning Models to Mobile Devices for Spam Classification
    Zainab, Ameema
    Syed, Dabeeruddin
    Al-Thani, Dena
    2019 IEEE FIRST INTERNATIONAL CONFERENCE ON COGNITIVE MACHINE INTELLIGENCE (COGMI 2019), 2019, : 112 - 117
  • [32] A Review on Methods and Applications in Multimodal Deep Learning
    Jabeen, Summaira
    Li, Xi
    Amin, Muhammad Shoib
    Bourahla, Omar
    Li, Songyuan
    Jabbar, Abdul
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2023, 19 (02)
  • [33] Scalable deep learning for healthcare: methods and applications
    Barillaro, Luca
    Agapito, Giuseppe
    Cannataro, Mario
    13TH ACM INTERNATIONAL CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND HEALTH INFORMATICS, BCB 2022, 2022
  • [34] Targeted deep learning: Framework, methods, and applications
    Huang, Shih-Ting
    Lederer, Johannes
    STAT, 2023, 12 (01)
  • [35] Deep learning for smart manufacturing: Methods and applications
    Wang, Jinjiang
    Ma, Yulin
    Zhang, Laibin
    Gao, Robert X.
    Wu, Dazhong
    JOURNAL OF MANUFACTURING SYSTEMS, 2018, 48 : 144 - 156
  • [36] Thin film processing and integration methods to enable affordable mobile communications systems
    Cole, MW
    Nothwang, WD
    Joshi, PC
    Hirsch, S
    Demaree, JD
    INTEGRATED FERROELECTRICS, 2005, 71 : 29 - 44
  • [37] Suggested Collaborative Learning Conceptual Architecture and Applications for Mobile Devices
    Lee, Kwang
    Razaque, Abdul
    DESIGN, USER EXPERIENCE, AND USABILITY: THEORY, METHODS, TOOLS AND PRACTICE, PT 1, 2011, 6769 : 611 - 620
  • [38] Distance Learning through Mobile Devices - Some Problems and Applications
    Shkodrova, Rossitza
    Dochev, Danail
    CYBERNETICS AND INFORMATION TECHNOLOGIES, 2006, 6 (02) : 54 - 54
  • [39] Lightweight Deep Embeddings Fusion Methods for Face Verification on Mobile Devices
    Kim, Youngsam
    Cho, Kwantae
    Roh, Jong-Hyuk
    Cho, Sangrae
    2019 10TH INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGY CONVERGENCE (ICTC): ICT CONVERGENCE LEADING THE AUTONOMOUS FUTURE, 2019, : 1133 - 1136
  • [40] Machine learning and deep learning methods for wireless network applications
    Abel C. H. Chen
    Wen-Kang Jia
    Feng-Jang Hwang
    Genggeng Liu
    Fangying Song
    Lianrong Pu
    EURASIP Journal on Wireless Communications and Networking, 2022