Fine-grained action recognition using multi-view attentions

Cited by: 16
Authors
Zhu, Yisheng [1]
Liu, Guangcan [1]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Nanjing, Peoples R China
Source
VISUAL COMPUTER | 2020 / Vol. 36 / Issue 09
Funding
National Natural Science Foundation of China;
Keywords
Multi-view attention; Action recognition; Deep neural networks;
DOI
10.1007/s00371-019-01770-y
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Inflated 3D ConvNet (I3D) uses 3D convolution to enrich the semantic information of features, forming a strong baseline for human action recognition. However, 3D convolution extracts features by mixing spatial, temporal and cross-channel information together, so it cannot emphasize meaningful features along specific dimensions. This is especially limiting for cross-channel information, which is crucial for recognizing fine-grained actions. In this paper, we propose a novel multi-view attention mechanism, named the channel-spatial-temporal attention (CSTA) block, to guide the network to pay more attention to the clues useful for fine-grained action recognition. Specifically, CSTA consists of three branches: a channel-spatial branch, a channel-temporal branch and a spatial-temporal branch. By directly plugging these branches into I3D, we further explore the impact of block location and of the number of blocks on recognition accuracy. We also examine two different strategies for designing a mixture of multiple CSTA blocks. Extensive experiments demonstrate the effectiveness of CSTA. When only RGB frames are used to train the network, I3D equipped with CSTA (I3D-CSTA) achieves accuracies of 95.76% and 73.97% on UCF101 and HMDB51, respectively. These results are comparable with those of methods that use both RGB frames and optical flow. Moreover, with the assistance of optical flow, the recognition accuracies of I3D-CSTA rise to 98.2% on UCF101 and 82.9% on HMDB51, outperforming many state-of-the-art methods.
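The abstract names the three CSTA branches but does not say how each one is computed. The sketch below is therefore only an illustration of the general idea of attending along three pairs of dimensions, not the authors' design. It assumes a single feature map of shape (C, T, H, W), a parameter-free gate (average pooling over the remaining dimension followed by a sigmoid) in place of learned layers, an average over the three gates, and a residual connection. The function name `multi_view_attention` and all of these choices are hypothetical.

```python
import numpy as np


def sigmoid(z):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def multi_view_attention(x):
    """Reweight a (C, T, H, W) feature map with three pooled gates.

    Hypothetical stand-in for a multi-view attention block: every gate
    is parameter-free here, whereas a trained block would use learned
    layers. This is an assumption, not the published CSTA block.
    """
    # Channel-spatial view: pool away time -> shape (C, 1, H, W).
    cs = sigmoid(x.mean(axis=1, keepdims=True))
    # Channel-temporal view: pool away space -> shape (C, T, 1, 1).
    ct = sigmoid(x.mean(axis=(2, 3), keepdims=True))
    # Spatial-temporal view: pool away channels -> shape (1, T, H, W).
    st = sigmoid(x.mean(axis=0, keepdims=True))
    # Broadcasting expands each gate to (C, T, H, W) before averaging.
    gate = (cs + ct + st) / 3.0
    # Residual form, so the block can only add emphasis to the input.
    return x + x * gate


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.standard_normal((8, 4, 7, 7))  # C=8, T=4, H=W=7
    out = multi_view_attention(features)
    print(out.shape)
```

Because every gate lies strictly between 0 and 1, the output keeps the sign of each input value and scales its magnitude by a factor between 1 and 2, so a block of this form can be inserted between existing layers without changing tensor shapes.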
Pages: 1771 - 1781
Page count: 11