Video Action Recognition with Adaptive Zooming Using Motion Residuals

Cited by: 1

Authors:

Shahabinejad, Mostafa [1]

Kezele, Irina [1]

Nabavi, Seyed Shahabeddin [1]

Liu, Wentao [1]

Patel, Seel [1]

Yu, Yuanhao [1]

Wang, Yang [2]

Tang, Jin [1]

Affiliations:

[1] Huawei Technologies, Noah's Ark Lab, Markham, ON, Canada

[2] Concordia University, Montreal, QC, Canada

Source:

2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) | 2023

Keywords:

COGNITIVE NEUROSCIENCE;

DOI:

10.1109/ICCVW60793.2023.00131

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Motivated by the mechanisms of selective visual attention in humans, we put forward an efficient method for learning spatial attention with adaptive zooming for video action recognition. The learnt module can be used as a plug-in with any 3D CNN action recognition model with clip-level processing. We propose to use relevant motion clues from video frames to adaptively learn input-clip optimal transformations, as these clues are hypothesized to be directly related to the action recognition task. We employ differentiable transformations and samplers and ensure end-to-end system differentiability. We render the proposed module light-weight and computationally efficient, by exploiting the motion information inherently present in compressed videos and readily available at both training and inference time. Highly informative motion-related content of compressed video domain modalities helps further boost action recognition accuracy. Our experimental work demonstrates clear benefits of the proposed method for adaptive spatial zooming and of utilizing the compressed domain for that purpose.
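
The abstract mentions differentiable transformations and samplers for adaptive zooming. As an illustration only, and not the authors' implementation, the following minimal sketch shows one common way such a zoom can be expressed: an affine map from an output grid to input coordinates, followed by bilinear interpolation. The function name `zoom_sample` and the parameters `cx`, `cy`, and `scale` are hypothetical, and the sketch handles a single grayscale frame in pure Python. How the zoom parameters would be predicted from motion residuals is not shown here.

```python
import math


def zoom_sample(frame, cx, cy, scale, out_h, out_w):
    """Bilinearly sample an out_h x out_w zoomed view of `frame`.

    frame : 2D list of floats (H x W), one grayscale frame.
    cx, cy: centre of the view in normalized coordinates, range [-1, 1].
    scale : fraction of the frame covered by the view
            (1.0 = whole frame, 0.5 = 2x zoom). All names are hypothetical.
    """
    h, w = len(frame), len(frame[0])
    out = []
    for i in range(out_h):
        # Normalized output row coordinate in [-1, 1].
        v = -1.0 + 2.0 * i / (out_h - 1) if out_h > 1 else 0.0
        row = []
        for j in range(out_w):
            # Normalized output column coordinate in [-1, 1].
            u = -1.0 + 2.0 * j / (out_w - 1) if out_w > 1 else 0.0
            # Affine map (scale + translation) to input pixel coordinates.
            x = (scale * u + cx + 1.0) * 0.5 * (w - 1)
            y = (scale * v + cy + 1.0) * 0.5 * (h - 1)
            # Clamp so that views reaching past the border stay valid.
            x = min(max(x, 0.0), w - 1.0)
            y = min(max(y, 0.0), h - 1.0)
            x0, y0 = int(math.floor(x)), int(math.floor(y))
            x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
            ax, ay = x - x0, y - y0
            # Bilinear interpolation between the four neighbouring pixels.
            top = (1.0 - ax) * frame[y0][x0] + ax * frame[y0][x1]
            bottom = (1.0 - ax) * frame[y1][x0] + ax * frame[y1][x1]
            row.append((1.0 - ay) * top + ay * bottom)
        out.append(row)
    return out


if __name__ == "__main__":
    # A 4x4 ramp image; zoom 2x on its centre and print a 2x2 view.
    ramp = [[float(4 * r + c) for c in range(4)] for r in range(4)]
    for line in zoom_sample(ramp, 0.0, 0.0, 0.5, 2, 2):
        print(line)
```

Because each output value is a weighted average whose weights vary continuously with `cx`, `cy`, and `scale`, the output is a piecewise-smooth function of those parameters. That property is what would let gradients reach a zoom-predicting module if the same computation were written in an automatic differentiation framework.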


Pages: 1206-1215

Page count: 10

