Mixed Attention and Channel Shift Transformer for Efficient Action Recognition

Cited by: 0

Authors:

Lu, Xiusheng [1]

Hao, Yanbin [2]

Cheng, Lechao [3]

Zhao, Sicheng [1]

Li, Yutao [4]

Song, Mingli [5]

Affiliations:

[1] Tsinghua Univ, Beijing, Peoples R China

[2] Univ Sci & Technol China, Hefei, Peoples R China

[3] Hefei Univ Technol, Hefei, Peoples R China

[4] Ocean Univ China, Qingdao, Peoples R China

[5] Zhejiang Univ, Hangzhou, Peoples R China

Source:

ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS | 2025 / Vol. 21 / No. 03

Funding:

National Natural Science Foundation of China; National Key Research and Development Program of China;

Keywords:

Action recognition; mixed attention; random attention; channel shift;

DOI:

10.1145/3712594

Chinese Library Classification number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

The practical use of the Transformer-based methods for processing videos is constrained by the high computing complexity. Although previous approaches adopt the spatiotemporal decomposition of 3D attention to mitigate the issue, they suffer from the drawback of neglecting the majority of visual tokens. This article presents a novel mixed attention operation that subtly fuses the random, spatial, and temporal attention mechanisms. The proposed random attention stochastically samples video tokens in a simple yet effective way, complementing other attention methods. Furthermore, since the attention operation concentrates on learning long-distance relationships, we employ the channel shift operation to encode short-term temporal characteristics. Our model can provide more comprehensive motion representations thanks to the amalgamation of these techniques. Experimental results show that the proposed method produces competitive action recognition results with low computational overhead on both large-scale and small-scale public video datasets.


Pages: 20
