RT3D: Achieving Real-Time Execution of 3D Convolutional Neural Networks on Mobile Devices

Cited by: 0
Authors
Niu, Wei [1]
Sun, Mengshu [2]
Li, Zhengang [2]
Chen, Jou-An [3]
Guan, Jiexiong [1]
Shen, Xipeng [3]
Wang, Yanzhi [2]
Liu, Sijia [4]
Lin, Xue [2]
Ren, Bin [1]
Affiliations
[1] William & Mary, Williamsburg, VA 23185 USA
[2] Northeastern Univ, Boston, MA 02115 USA
[3] North Carolina State Univ, Raleigh, NC 27695 USA
[4] Michigan State Univ, E Lansing, MI 48824 USA
Funding
National Science Foundation (USA);
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Mobile devices are becoming an important carrier for deep learning tasks, as they are being equipped with powerful, high-end mobile CPUs and GPUs. However, it is still challenging to execute 3D Convolutional Neural Networks (CNNs) with real-time performance in addition to high inference accuracy. The reason is that the more complex model structure and higher model dimensionality overwhelm the available computation/storage resources on mobile devices. A natural approach may be to turn to deep learning weight pruning techniques. However, the direct generalization of existing 2D CNN weight pruning methods to 3D CNNs is not ideal for fully exploiting mobile parallelism while achieving high inference accuracy. This paper proposes RT3D, a model compression and mobile acceleration framework for 3D CNNs, seamlessly integrating neural network weight pruning and compiler code generation techniques. We propose and investigate two structured sparsity schemes, i.e., the vanilla structured sparsity and kernel group structured (KGS) sparsity, that are mobile acceleration friendly. The vanilla sparsity removes whole kernel groups, while KGS sparsity is a more fine-grained structured sparsity that enjoys higher flexibility while exploiting full on-device parallelism. We propose a reweighted regularization pruning algorithm to achieve the proposed sparsity schemes. The inference time speedup due to sparsity approaches the pruning rate of the whole model FLOPs (floating point operations). RT3D demonstrates up to 29.1× speedup in end-to-end inference time compared with current mobile frameworks supporting 3D CNNs, with a moderate 1% to 1.5% accuracy loss. The end-to-end inference time for 16 video frames can be within 150 ms when executing representative C3D and R(2+1)D models on a cellphone. For the first time, real-time execution of 3D CNNs is achieved on off-the-shelf mobile devices.
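As a rough illustration of the vanilla scheme named in the abstract (removing whole kernel groups), the sketch below zeroes entire groups of a 3D convolution weight tensor. Several things here are assumptions and not taken from the record: a "kernel group" is modeled as a block of `g_out` filters by `g_in` input channels, groups are ranked by L2 norm (a simple magnitude criterion, not the reweighted regularization algorithm the paper proposes), and the function name `prune_kernel_groups` is hypothetical. KGS sparsity is not sketched because the abstract does not specify its pattern.

```python
import numpy as np

def prune_kernel_groups(weight, g_out, g_in, keep_ratio):
    """Zero out whole kernel groups of a 3D conv weight (illustrative only).

    weight: array of shape (C_out, C_in, D, H, W).
    A group is assumed to be g_out filters x g_in input channels.
    Groups are ranked by L2 norm (an assumption, not the paper's criterion).
    Returns the pruned weight and a boolean keep-mask over groups.
    """
    c_out, c_in = weight.shape[:2]
    assert c_out % g_out == 0 and c_in % g_in == 0
    n_out, n_in = c_out // g_out, c_in // g_in

    # Split the filter and channel axes into (group index, index within group).
    grouped = weight.reshape(n_out, g_out, n_in, g_in, *weight.shape[2:])

    # One L2 norm per group: reduce over everything except the group indices.
    norms = np.sqrt((grouped ** 2).sum(axis=(1, 3, 4, 5, 6)))

    # Keep the groups with the largest norms.
    n_keep = max(1, int(round(keep_ratio * norms.size)))
    order = np.argsort(norms, axis=None)[::-1]
    keep = np.zeros(norms.size, dtype=bool)
    keep[order[:n_keep]] = True
    mask = keep.reshape(n_out, n_in)

    # Broadcast the group mask over the within-group and kernel axes.
    pruned = grouped * mask[:, None, :, None, None, None, None]
    return pruned.reshape(weight.shape), mask

# Usage: prune half of the groups of a small random 3x3x3 conv layer.
rng = np.random.default_rng(0)
w = rng.standard_normal((8, 4, 3, 3, 3))
w_pruned, group_mask = prune_kernel_groups(w, g_out=4, g_in=2, keep_ratio=0.5)
print("groups kept:", int(group_mask.sum()), "of", group_mask.size)
print("nonzero weights remaining:", np.count_nonzero(w_pruned) / w.size)
```

Because each removed group is a dense, regularly shaped block, the remaining computation stays regular, which is the general reason structured sparsity is considered friendlier to mobile parallelism than unstructured pruning.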
Pages: 9179 - 9187
Number of pages: 9
Related papers
50 in total
  • [1] Real-Time 3D Hand Pose Estimation with 3D Convolutional Neural Networks
    Ge, Liuhao
    Liang, Hui
    Yuan, Junsong
    Thalmann, Daniel
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2019, 41 (04) : 956 - 970
  • [2] Continual 3D Convolutional Neural Networks for Real-time Processing of Videos
    Hedegaard, Lukas
    Iosifidis, Alexandros
    COMPUTER VISION - ECCV 2022, PT IV, 2022, 13664 : 369 - 385
  • [3] Using 3D Convolutional Neural Networks for Real-time Detection of Soccer Events
    Rongved, Olav A. Nergard
    Hicks, Steven A.
    Thambawita, Vajira
    Stensland, Hakon K.
    Zouganeli, Evi
    Johansen, Dag
    Midoglu, Cise
    Riegler, Michael A.
    Halvorsen, Pal
    INTERNATIONAL JOURNAL OF SEMANTIC COMPUTING, 2021, 15 (02) : 161 - 187
  • [4] A 3D Convolutional Neural Network Towards Real-time Amodal 3D Object Detection
    Sun, Hao
    Meng, Zehui
    Du, Xinxin
    Ang, Marcelo H., Jr.
    2018 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2018, : 8331 - 8338
  • [5] OpenGL & mobile devices - Real-time 3D graphics for handheld devices
    Wright, Richard S., Jr.
    DR DOBBS JOURNAL, 2006, 31 (06) : 30 - +
  • [6] Quantification of the volumetric flow rate using real-time 3D (RT3D) color Doppler echocardiography: An animal study
    Tsujino, H
    Jones, M
    Shiota, T
    Qin, JX
    Cardon, LA
    Morehead, AJ
    Zetts, AD
    Bauer, F
    Travaglini, A
    Greenberg, NL
    Panza, JA
    Thomas, JD
    JOURNAL OF THE AMERICAN COLLEGE OF CARDIOLOGY, 2000, 35 (02) : 461A - 462A
  • [7] Real-Time Detection of Events in Soccer Videos using 3D Convolutional Neural Networks
    Rongved, Olav A. Norgard
    Hicks, Steven A.
    Thambawita, Vajira
    Stensland, Hakon K.
    Zouganeli, Evi
    Johansen, Dag
    Riegler, Michael A.
    Halvorsen, Pal
    2020 IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA (ISM 2020), 2020, : 135 - 144
  • [8] RT3D: Real-Time 3-D Vehicle Detection in LiDAR Point Cloud for Autonomous Driving
    Zeng, Yiming
    Hu, Yu
    Liu, Shice
    Ye, Jing
    Han, Yinhe
    Li, Xiaowei
    Sun, Ninghui
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2018, 3 (04) : 3434 - 3440
  • [9] Real-time monitoring of driver drowsiness on mobile platforms using 3D neural networks
    Wijnands, Jasper S.
    Thompson, Jason
    Nice, Kerry A.
    Aschwanden, Gideon D. P. A.
    Stevenson, Mark
    NEURAL COMPUTING & APPLICATIONS, 2020, 32 (13) : 9731 - 9743
  • [10] Real-time monitoring of driver drowsiness on mobile platforms using 3D neural networks
    Jasper S. Wijnands
    Jason Thompson
    Kerry A. Nice
    Gideon D. P. A. Aschwanden
    Mark Stevenson
    Neural Computing and Applications, 2020, 32 : 9731 - 9743