Video Action Recognition with Adaptive Zooming Using Motion Residuals

Cited by: 1
Authors
Shahabinejad, Mostafa [1]
Kezele, Irina [1]
Nabavi, Seyed Shahabeddin [1]
Liu, Wentao [1]
Patel, Seel [1]
Yu, Yuanhao [1]
Wang, Yang [2]
Tang, Jin [1]
Affiliations
[1] Huawei Technologies, Noah's Ark Lab, Markham, ON, Canada
[2] Concordia University, Montreal, QC, Canada
Keywords
COGNITIVE NEUROSCIENCE;
DOI
10.1109/ICCVW60793.2023.00131
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Motivated by the mechanisms of selective visual attention in humans, we put forward an efficient method for learning spatial attention with adaptive zooming for video action recognition. The learnt module can be used as a plug-in with any 3D CNN action recognition model with clip-level processing. We propose to use relevant motion clues from video frames to adaptively learn input-clip optimal transformations, as these clues are hypothesized to be directly related to the action recognition task. We employ differentiable transformations and samplers and ensure end-to-end system differentiability. We render the proposed module light-weight and computationally efficient, by exploiting the motion information inherently present in compressed videos and readily available at both training and inference time. Highly informative motion-related content of compressed video domain modalities helps further boost action recognition accuracy. Our experimental work demonstrates clear benefits of the proposed method for adaptive spatial zooming and of utilizing the compressed domain for that purpose.
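The zooming idea in the abstract can be illustrated with a minimal sketch. It is not the authors' module: the abstract describes a learnt transformation, whereas the hypothetical `zoom_params_from_residual` below uses a hand-written heuristic (energy-weighted centroid and spread of a motion-residual map) as a stand-in. The function names, the `min_scale` parameter, and the toy data are all assumptions made for illustration. The warp uses an affine grid with bilinear sampling, the kind of sampler that is differentiable with respect to the transform parameters in an autograd framework; this plain-Python version computes no gradients.

```python
import math


def _norm(i, n):
    """Map pixel index i in [0, n-1] to a normalised coordinate in [-1, 1]."""
    return 2.0 * i / (n - 1) - 1.0 if n > 1 else 0.0


def zoom_params_from_residual(residual, min_scale=0.3):
    """Heuristic stand-in (assumption) for a learnt localisation module.

    residual: H x W list of non-negative motion-residual magnitudes.
    Returns (scale, cx, cy): window half-size and centre in [-1, 1] coordinates.
    """
    h, w = len(residual), len(residual[0])
    total = sum(sum(row) for row in residual)
    if total <= 0:
        return 1.0, 0.0, 0.0  # no motion energy: identity transform
    cx = sum(v * _norm(x, w) for row in residual for x, v in enumerate(row)) / total
    cy = sum(v * _norm(y, h) for y, row in enumerate(residual) for v in row) / total
    var = sum(
        v * ((_norm(x, w) - cx) ** 2 + (_norm(y, h) - cy) ** 2)
        for y, row in enumerate(residual)
        for x, v in enumerate(row)
    ) / total
    scale = min(1.0, max(min_scale, 2.0 * math.sqrt(var)))
    limit = 1.0 - scale  # keep the zoom window inside the frame
    return scale, max(-limit, min(limit, cx)), max(-limit, min(limit, cy))


def bilinear_sample(frame, x, y):
    """Sample frame at normalised (x, y) with bilinear interpolation."""
    h, w = len(frame), len(frame[0])
    px = min(max((x + 1.0) / 2.0 * (w - 1), 0.0), w - 1.0)
    py = min(max((y + 1.0) / 2.0 * (h - 1), 0.0), h - 1.0)
    x0, y0 = int(math.floor(px)), int(math.floor(py))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = px - x0, py - y0
    top = frame[y0][x0] * (1 - fx) + frame[y0][x1] * fx
    bottom = frame[y1][x0] * (1 - fx) + frame[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def zoom(frame, scale, cx, cy, out_h, out_w):
    """Apply the affine map (u, v) -> (scale*u + cx, scale*v + cy) and resample."""
    return [
        [
            bilinear_sample(frame, scale * _norm(j, out_w) + cx, scale * _norm(i, out_h) + cy)
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Toy usage: a 5x5 frame and a residual map with all energy at the centre pixel.
frame = [[float(5 * y + x) for x in range(5)] for y in range(5)]
residual = [[0.0] * 5 for _ in range(5)]
residual[2][2] = 1.0
scale, cx, cy = zoom_params_from_residual(residual)
zoomed = zoom(frame, scale, cx, cy, out_h=3, out_w=3)
```

In a clip-level setting, one would apply the same `(scale, cx, cy)` to every frame of the clip before passing it to the recognition backbone; how the paper actually parameterises and trains the transformation is not specified in this record.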
Pages: 1206-1215
Number of pages: 10