Continual 3D Convolutional Neural Networks for Real-time Processing of Videos

Cited by: 3
Authors
Hedegaard, Lukas [1]
Iosifidis, Alexandros [1]
Affiliations
[1] Aarhus Univ, Dept Elect & Comp Engn, Aarhus, Denmark
Keywords
3D CNN; Human activity recognition; Efficient; Stream processing; Online inference; Continual inference network
DOI
10.1007/978-3-031-19772-7_22
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We introduce Continual 3D Convolutional Neural Networks (Co3D CNNs), a new computational formulation of spatio-temporal 3D CNNs, in which videos are processed frame-by-frame rather than by clip. In online tasks demanding frame-wise predictions, Co3D CNNs dispense with the computational redundancies of regular 3D CNNs, namely the repeated convolutions over frames, which appear in overlapping clips. We show that Continual 3D CNNs can reuse preexisting 3D-CNN weights to reduce the per-prediction floating point operations (FLOPs) in proportion to the temporal receptive field while retaining similar memory requirements and accuracy. This is validated with multiple models on Kinetics-400 and Charades with remarkable results: CoX3D models attain state-of-the-art complexity/accuracy trade-offs on Kinetics-400 with 12.1-15.3x reductions of FLOPs and 2.3-3.8% improvements in accuracy compared to regular X3D models while reducing peak memory consumption by up to 48%. Moreover, we investigate the transient response of Co3D CNNs at start-up and perform extensive benchmarks of on-hardware processing characteristics for publicly available 3D CNNs.
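The frame-by-frame idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it reduces the problem to a single 1-D temporal convolution over scalar "frames", and the names `clip_conv` and `ContinualConv` are hypothetical. It only shows that a layer which buffers the most recent frames and emits one output per incoming frame reproduces the outputs of the same weights applied to the full sequence, which is why preexisting weights can be reused.

```python
from collections import deque


def clip_conv(frames, weights):
    """Regular temporal convolution: slide the kernel over a whole clip."""
    k = len(weights)
    return [
        sum(w * x for w, x in zip(weights, frames[t - k + 1 : t + 1]))
        for t in range(k - 1, len(frames))
    ]


class ContinualConv:
    """Frame-by-frame variant (illustrative only).

    Keeps the last k frames in a buffer and produces one output per step,
    so no frame is convolved again when the next frame arrives.
    """

    def __init__(self, weights):
        self.weights = list(weights)
        self.buffer = deque(maxlen=len(self.weights))

    def step(self, frame):
        self.buffer.append(frame)
        if len(self.buffer) < len(self.weights):
            return None  # start-up transient: buffer not yet full
        # deque iterates oldest to newest, matching the kernel order above
        return sum(w * x for w, x in zip(self.weights, self.buffer))


frames = [1.0, 2.0, 3.0, 4.0, 5.0]
weights = [0.5, -1.0, 2.0]

offline = clip_conv(frames, weights)

layer = ContinualConv(weights)
online = [y for y in (layer.step(f) for f in frames) if y is not None]

print(offline == online)
```

In an online setting, a clip-based model would call something like `clip_conv` on an overlapping window for every new frame and recompute most of its outputs, whereas each `step` call does one kernel's worth of work. The `None` values returned during the first steps correspond loosely to the start-up transient mentioned in the abstract.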
Pages: 369-385
Number of pages: 17