Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications

Cited by: 49

Authors:

Cai, Han [1]

Lin, Ji [1]

Lin, Yujun [1]

Liu, Zhijian [1]

Tang, Haotian [1]

Wang, Hanrui [1]

Zhu, Ligeng [1]

Han, Song [1]

Affiliation:

[1] MIT, 77 Massachusetts Ave, Cambridge, MA 02139 USA

Source:

ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS | 2022 / Vol. 27 / Issue 03

Keywords:

Efficient deep learning; TinyML; model compression; AutoML; neural architecture search; NEURAL-NETWORK ACCELERATOR; ARCHITECTURE; IMPLEMENTATION; COPROCESSOR; PREDICTION; MODEL;

DOI:

10.1145/3486618

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Deep neural networks (DNNs) have achieved unprecedented success in the field of artificial intelligence (AI), including computer vision, natural language processing, and speech recognition. However, their superior performance comes at the considerable cost of computational complexity, which greatly hinders their applications in many resource-constrained devices, such as mobile phones and Internet of Things (IoT) devices. Therefore, methods and techniques that are able to lift the efficiency bottleneck while preserving the high accuracy of DNNs are in great demand to enable numerous edge AI applications. This article provides an overview of efficient deep learning methods, systems, and applications. We start from introducing popular model compression methods, including pruning, factorization, quantization, as well as compact model design. To reduce the large design cost of these manual solutions, we discuss the AutoML framework for each of them, such as neural architecture search (NAS) and automated pruning and quantization. We then cover efficient on-device training to enable user customization based on the local data on mobile devices. Apart from general acceleration techniques, we also showcase several task-specific accelerations for point cloud, video, and natural language processing by exploiting their spatial sparsity and temporal/token redundancy. Finally, to support all these algorithmic advancements, we introduce the efficient deep learning system design from both software and hardware perspectives.


Pages: 50

50 items in total

[1] Cooperation of Mobile Devices for Fast Inference of Deep Learning Applications
Qinglin Yang
Xiaofei Luo
Peng Li
Toshiaki Miyazaki
Wenfeng Shen
Weiqin Tong
Mobile Networks and Applications, 2021, 26 : 1243 - 1249
[2] Cooperation of Mobile Devices for Fast Inference of Deep Learning Applications
Yang, Qinglin
Luo, Xiaofei
Li, Peng
Miyazaki, Toshiaki
Shen, Wenfeng
Tong, Weiqin
MOBILE NETWORKS & APPLICATIONS, 2021, 26 (03) : 1243 - 1249
[3] Deep learning neural networks: Methods, systems, and applications
Wei, Qinglai
Kasabov, Nikola
Polycarpou, Marios
Zeng, Zhigang
NEUROCOMPUTING, 2020, 396 : 130 - 132
[4] Deep Learning on Mobile Devices - A Review
Deng, Yunbin
MOBILE MULTIMEDIA/IMAGE PROCESSING, SECURITY, AND APPLICATIONS 2019, 2019, 10993
[5] A Survey of Deep Learning on Mobile Devices: Applications, Optimizations, Challenges, and Research Opportunities
Zhao, Tianming
Xie, Yucheng
Wang, Yan
Cheng, Jerry
Guo, Xiaonan
Hu, Bin
Chen, Yingying
PROCEEDINGS OF THE IEEE, 2022, 110 (03) : 334 - 354
[6] Deep Learning on Mobile Systems
Curukoglu, Nur
Ozyildirim, Buse Melis
2018 INNOVATIONS IN INTELLIGENT SYSTEMS AND APPLICATIONS CONFERENCE (ASYU), 2018 : 179 - 182
[7] Deep Learning: Methods and Applications
Deng, Li
Yu, Dong
FOUNDATIONS AND TRENDS IN SIGNAL PROCESSING, 2013, 7 (3-4) : I - 387
[8] Deep learning for face recognition on mobile devices
Rios-Sanchez, Belen
Costa-da Silva, David
Martin-Yuste, Natalia
Sanchez-Avila, Carmen
IET BIOMETRICS, 2020, 9 (03) : 109 - 117
[9] Squeezing Deep Learning into Mobile and Embedded Devices
Lane, Nicholas D.
Bhattacharya, Sourav
Mathur, Akhil
Georgiev, Petko
Forlivesi, Claudio
Kawsar, Fahim
IEEE PERVASIVE COMPUTING, 2017, 16 (03) : 82 - 88
[10] Deep Learning for Text Data on Mobile Devices
Sido, Jakub
Konopik, Miloslav
2019 24TH INTERNATIONAL CONFERENCE ON APPLIED ELECTRONICS (AE), 2019 : 147 - 150
