RT3D: Achieving Real-Time Execution of 3D Convolutional Neural Networks on Mobile Devices

Citations: 0

Authors:

Niu, Wei [1]

Sun, Mengshu [2]

Li, Zhengang [2]

Chen, Jou-An [3]

Guan, Jiexiong [1]

Shen, Xipeng [3]

Wang, Yanzhi [2]

Liu, Sijia [4]

Lin, Xue [2]

Ren, Bin [1]

Affiliations:

[1] William & Mary, Williamsburg, VA 23185 USA

[2] Northeastern Univ, Boston, MA 02115 USA

[3] North Carolina State Univ, Raleigh, NC 27695 USA

[4] Michigan State Univ, E Lansing, MI 48824 USA

Source:

Thirty-Fifth AAAI Conference on Artificial Intelligence, Thirty-Third Conference on Innovative Applications of Artificial Intelligence and the Eleventh Symposium on Educational Advances in Artificial Intelligence | 2021, Vol. 35

Funding:

U.S. National Science Foundation;

Keywords:

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Mobile devices are becoming an important carrier for deep learning tasks, as they are being equipped with powerful, high-end mobile CPUs and GPUs. However, it is still challenging to execute 3D Convolutional Neural Networks (CNNs) with real-time performance in addition to high inference accuracy. The reason is that the more complex model structure and higher model dimensionality overwhelm the computation and storage resources available on mobile devices. A natural approach is to turn to deep learning weight pruning techniques. However, directly generalizing existing 2D CNN weight pruning methods to 3D CNNs is not ideal for fully exploiting mobile parallelism while achieving high inference accuracy. This paper proposes RT3D, a model compression and mobile acceleration framework for 3D CNNs that seamlessly integrates neural network weight pruning and compiler code generation techniques. We propose and investigate two structured sparsity schemes that are friendly to mobile acceleration, i.e., the vanilla structured sparsity and the kernel group structured (KGS) sparsity. The vanilla sparsity removes whole kernel groups, while KGS sparsity is a more fine-grained structured sparsity that enjoys higher flexibility while exploiting full on-device parallelism. We propose a reweighted regularization pruning algorithm to achieve the proposed sparsity schemes. The inference time speedup due to sparsity approaches the pruning rate of the whole model's FLOPs (floating point operations). RT3D demonstrates up to 29.1x speedup in end-to-end inference time compared with current mobile frameworks supporting 3D CNNs, with a moderate 1%-1.5% accuracy loss. The end-to-end inference time for 16 video frames can be within 150 ms when executing representative C3D and R(2+1)D models on a cellphone. For the first time, real-time execution of 3D CNNs is achieved on off-the-shelf mobile devices.
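
Illustrative sketch (not from the paper): the abstract states that the vanilla structured sparsity removes whole kernel groups. The Python below shows one way such a scheme could look, under several assumptions: a kernel group is taken to be a g_out x g_in block of 3D kernels, each kernel is stored as a flat list of floats, and groups are selected by smallest L2 norm. The magnitude criterion is a simple stand-in for the paper's reweighted regularization algorithm, whose details the abstract does not give. The function name `prune_kernel_groups` and all parameters are hypothetical.

```python
import math


def prune_kernel_groups(weights, g_out, g_in, prune_ratio):
    """Zero whole kernel groups with the smallest L2 norm.

    weights[o][i] is one 3D kernel flattened to a list of floats
    (o = output channel, i = input channel). A "kernel group" is
    assumed here to be a g_out x g_in block of such kernels.
    """
    n_out, n_in = len(weights), len(weights[0])
    assert n_out % g_out == 0 and n_in % g_in == 0

    # Collect (norm, block_row_start, block_col_start) for every group.
    groups = []
    for bo in range(0, n_out, g_out):
        for bi in range(0, n_in, g_in):
            sq = sum(w * w
                     for o in range(bo, bo + g_out)
                     for i in range(bi, bi + g_in)
                     for w in weights[o][i])
            groups.append((math.sqrt(sq), bo, bi))

    # Assumption: plain magnitude ranking, not reweighted regularization.
    groups.sort()
    n_prune = int(len(groups) * prune_ratio)

    # Copy the weights so the input is left untouched.
    pruned = [[list(kernel) for kernel in row] for row in weights]
    for _, bo, bi in groups[:n_prune]:
        for o in range(bo, bo + g_out):
            for i in range(bi, bi + g_in):
                pruned[o][i] = [0.0] * len(pruned[o][i])
    return pruned, n_prune, len(groups)


# Toy layer: 4 output x 4 input channels, each kernel 2x2x2 = 8 weights.
# Every 2x2 block of kernels shares one value (1, 2, 3 or 4).
weights = [[[float(1 + (o // 2) * 2 + (i // 2))] * 8 for i in range(4)]
           for o in range(4)]
pruned, n_prune, n_groups = prune_kernel_groups(weights, 2, 2, 0.5)
```

Because entire groups are zeroed, the surviving computation keeps a regular block structure, which is the kind of regularity that generated mobile code can exploit. The finer-grained KGS scheme is not sketched here, since the abstract does not specify its pattern.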

Pages: 9179-9187

Page count: 9
