Video Action Recognition with Adaptive Zooming Using Motion Residuals

Cited by: 1
Authors
Shahabinejad, Mostafa [1]
Kezele, Irina [1]
Nabavi, Seyed Shahabeddin [1]
Liu, Wentao [1]
Patel, Seel [1]
Yu, Yuanhao [1]
Wang, Yang [2]
Tang, Jin [1]
Affiliations
[1] Huawei Technol, Noahs Ark Labs, Markham, ON, Canada
[2] Concordia Univ, Montreal, PQ, Canada
Keywords
COGNITIVE NEUROSCIENCE
DOI
10.1109/ICCVW60793.2023.00131
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Motivated by the mechanisms of selective visual attention in humans, we put forward an efficient method for learning spatial attention with adaptive zooming for video action recognition. The learnt module can be used as a plug-in with any 3D CNN action recognition model with clip-level processing. We propose to use relevant motion clues from video frames to adaptively learn input-clip optimal transformations, as these clues are hypothesized to be directly related to the action recognition task. We employ differentiable transformations and samplers and ensure end-to-end system differentiability. We render the proposed module light-weight and computationally efficient, by exploiting the motion information inherently present in compressed videos and readily available at both training and inference time. Highly informative motion-related content of compressed video domain modalities helps further boost action recognition accuracy. Our experimental work demonstrates clear benefits of the proposed method for adaptive spatial zooming and of utilizing the compressed domain for that purpose.
Pages: 1206-1215
Page count: 10
Related papers
50 in total
  • [1] Human Action Recognition Using Accumulated Motion and Gradient of Motion from Video
    Thanikachalam, V.
    Thyagharajan, K. K.
    2012 THIRD INTERNATIONAL CONFERENCE ON COMPUTING COMMUNICATION & NETWORKING TECHNOLOGIES (ICCCNT), 2012,
  • [2] Slow motion replay of video sequences using fractal zooming
    Giusto, DD
    Murroni, M
    Soro, G
    IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 2005, 51 (01) : 103 - 111
  • [3] Slow motion replay of video sequences using fractal zooming
    Giusto, DD
    Murroni, M
    Soro, G
    ICCE: 2005 INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS, DIGEST OF TECHNICAL PAPERS, 2005, : 215 - 216
  • [4] Action Recognition in Surveillance Video Using ConvNets and Motion History Image
    Luo, Sheng
    Yang, Haojin
    Wang, Cheng
    Che, Xiaoyin
    Meinel, Christoph
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2016, PT II, 2016, 9887 : 187 - 195
  • [5] Human Action Recognition Using Adaptive Local Motion Descriptor in Spark
    Uddin, M. D. Azher
    Joolee, Joolekha Bibi
    Alam, Aftab
    Lee, Young-Koo
    IEEE ACCESS, 2017, 5 : 21157 - 21167
  • [6] Spatio-temporal adaptive convolution and bidirectional motion difference fusion for video action recognition
    Li, Linxi
    Tang, Mingwei
    Yang, Zhendong
    Hu, Jie
    Zhao, Mingfeng
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 255
  • [7] Motion Feature Combination for Human Action Recognition in Video
    Meng, Hongying
    Pears, Nick
    Bailey, Chris
    COMPUTER VISION AND COMPUTER GRAPHICS, 2008, 21 : 151 - +
  • [8] TOWARDS TEMPORAL ADAPTIVE REPRESENTATION FOR VIDEO ACTION RECOGNITION
    Cai, Junjie
    Yu, Jie
    Imai, Francisco
    Tian, Qi
    2016 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2016, : 4155 - 4159
  • [9] Video Action Recognition Using Motion and Multi-View Excitation with Temporal Aggregation
    Joefrie, Yuri Yudhaswana
    Aono, Masaki
    ENTROPY, 2022, 24 (11)
  • [10] Video rendering: Zooming video using fractals
    Murroni, Maurizio
    Soro, Giulio
    VISUAL CONTENT PROCESSING AND REPRESENTATION, 2006, 3893 : 84 - 91