Video Action Recognition with Adaptive Zooming Using Motion Residuals

Cited by: 1
Authors
Shahabinejad, Mostafa [1]
Kezele, Irina [1]
Nabavi, Seyed Shahabeddin [1]
Liu, Wentao [1]
Patel, Seel [1]
Yu, Yuanhao [1]
Wang, Yang [2]
Tang, Jin [1]
Affiliations
[1] Huawei Technol, Noahs Ark Labs, Markham, ON, Canada
[2] Concordia Univ, Montreal, PQ, Canada
Keywords
COGNITIVE NEUROSCIENCE
DOI
10.1109/ICCVW60793.2023.00131
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Motivated by the mechanisms of selective visual attention in humans, we put forward an efficient method for learning spatial attention with adaptive zooming for video action recognition. The learnt module can be used as a plug-in with any 3D CNN action recognition model with clip-level processing. We propose to use relevant motion clues from video frames to adaptively learn input-clip optimal transformations, as these clues are hypothesized to be directly related to the action recognition task. We employ differentiable transformations and samplers and ensure end-to-end system differentiability. We render the proposed module light-weight and computationally efficient, by exploiting the motion information inherently present in compressed videos and readily available at both training and inference time. Highly informative motion-related content of compressed video domain modalities helps further boost action recognition accuracy. Our experimental work demonstrates clear benefits of the proposed method for adaptive spatial zooming and of utilizing the compressed domain for that purpose.
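As a rough illustration of the adaptive zooming idea in the abstract, the sketch below resamples a frame through an affine zoom whose centre is taken from a motion-residual map. It is not the authors' method: in the paper the transformation is learned end to end, whereas here the zoom centre is a hand-written energy-weighted centroid and the scale is passed in by the caller. The function names (`motion_centroid`, `bilinear`, `zoom`) and the nested-list frame layout are assumptions made for this example only.

```python
# Illustrative sketch only: a hand-written heuristic stands in for the learned
# attention module described in the abstract. All names are hypothetical.

def motion_centroid(residual):
    """Energy-weighted centre (cx, cy) of a 2D residual map, in [-1, 1] coords.

    Assumes the map has at least 2 rows and 2 columns.
    """
    h, w = len(residual), len(residual[0])
    total = sx = sy = 0.0
    for y in range(h):
        for x in range(w):
            e = abs(residual[y][x])
            total += e
            sx += e * x
            sy += e * y
    if total == 0.0:
        return 0.0, 0.0  # no motion: keep the frame centre
    cx = 2.0 * (sx / total) / (w - 1) - 1.0
    cy = 2.0 * (sy / total) / (h - 1) - 1.0
    return cx, cy


def bilinear(frame, x, y):
    """Bilinear interpolation at pixel coords (x, y), clamped to the border.

    The output is a smooth (piecewise-linear) function of x and y, which is
    what lets gradients flow through such a sampler in an autodiff framework.
    """
    h, w = len(frame), len(frame[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = frame[y0][x0] * (1 - fx) + frame[y0][x1] * fx
    bot = frame[y1][x0] * (1 - fx) + frame[y1][x1] * fx
    return top * (1 - fy) + bot * fy


def zoom(frame, scale, cx, cy):
    """Resample `frame` through the affine map (u, v) = scale * (x, y) + (cx, cy).

    Coordinates are normalized to [-1, 1]; scale < 1 zooms in around (cx, cy).
    """
    h, w = len(frame), len(frame[0])
    out = []
    for j in range(h):
        row = []
        for i in range(w):
            xn = 2.0 * i / (w - 1) - 1.0
            yn = 2.0 * j / (h - 1) - 1.0
            u = scale * xn + cx
            v = scale * yn + cy
            px = (u + 1.0) * (w - 1) / 2.0
            py = (v + 1.0) * (h - 1) / 2.0
            row.append(bilinear(frame, px, py))
        out.append(row)
    return out
```

A caller would compute `cx, cy = motion_centroid(residual)` from a residual map and then apply `zoom(frame, scale, cx, cy)` to each frame of the clip. With `scale=1.0` and a centre of `(0, 0)` the frame is returned unchanged.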
Pages: 1206 - 1215
Page count: 10
Related papers
50 in total
  • [21] PERCEPTUALLY ADAPTIVE VIDEO WATERMARKING USING MOTION ESTIMATION
    Echizen, Isao
    Fujii, Yasuhiro
    Yamada, Takaaki
    Tezuka, Satoru
    Yoshiura, Hiroshi
    INTERNATIONAL JOURNAL OF IMAGE AND GRAPHICS, 2005, 5 (01) : 89 - 109
  • [22] Motion compensated video compression using adaptive transformations
    Diab, Z
    Cohen, P
    1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997, : 2881 - 2884
  • [23] Action Recognition by Jointly Using Video Proposal and Trajectory
    Qi, Lei
    Lu, Xiaoqiang
    Li, Xuelong
    PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON VISION, IMAGE AND SIGNAL PROCESSING (ICVISP 2018), 2018,
  • [24] Unsupervised Video-Based Action Recognition With Imagining Motion and Perceiving Appearance
    Lin, Wei
    Liu, Xiaoyu
    Zhuang, Yihong
    Ding, Xinghao
    Tu, Xiaotong
    Huang, Yue
    Zeng, Huanqiang
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (05) : 2245 - 2258
  • [25] M2A: Motion Aware Attention for Accurate Video Action Recognition
    Gebotys, Brennan
    Wong, Alexander
    Clausi, David A.
    2022 19TH CONFERENCE ON ROBOTS AND VISION (CRV 2022), 2022, : 83 - 89
  • [26] A method of badminton video motion recognition based on adaptive enhanced AdaBoost algorithm
    Chang, Yuntao
    INTERNATIONAL JOURNAL OF BIOMETRICS, 2025, 17 (1-2)
  • [27] Human action recognition using motion energy template
    Shao, Yanhua
    Guo, Yongcai
    Gao, Chao
    OPTICAL ENGINEERING, 2015, 54 (06)
  • [28] Pedestrian Action Recognition using Motion Type Classification
    Hariyono, Joko
    Jo, Kang-Hyun
    2015 IEEE 2ND INTERNATIONAL CONFERENCE ON CYBERNETICS (CYBCONF), 2015, : 129 - 132
  • [29] HUMAN ACTION RECOGNITION USING THE MOTION OF INTEREST POINTS
    Monti, Francesco
    Regazzoni, Carlo S.
    2010 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, 2010, : 709 - 712
  • [30] Guiding robot motion using zooming and focusing
    Zheng, JY
    Sakai, T
    Abe, N
    IROS 96 - PROCEEDINGS OF THE 1996 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS - ROBOTIC INTELLIGENCE INTERACTING WITH DYNAMIC WORLDS, VOLS 1-3, 1996, : 1076 - 1082