Boxless Action Recognition in Still Images via Recurrent Visual Attention

Cited by: 1
Authors
Feng, Weijiang [1]
Zhang, Xiang [1]
Huang, Xuhui [1]
Luo, Zhigang [1]
Affiliations
[1] National University of Defense Technology, College of Computer, Changsha, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Action recognition; Convolutional neural network; Soft attention; Recurrent neural network; Spatial pyramid pooling
DOI
10.1007/978-3-319-70096-0_68
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Boxless action recognition in still images means recognizing human actions without ground-truth bounding boxes, which makes it more challenging than traditional action recognition tasks. To this end, AttSPP-net integrates soft attention and spatial pyramid pooling into a convolutional neural network and achieves recognition accuracy comparable even to some bounding-box-based approaches. However, the soft attention of AttSPP-net concentrates on only one fixation, whereas human visual attention combines information from different fixations over time. In this paper, we take inspiration from this mechanism and propose ReAttSPP-net for boxless action recognition. ReAttSPP-net uses a recurrent neural network model of visual attention to extract information from a sequence of fixations. Experiments on three public action recognition benchmark datasets (PASCAL VOC 2012, Willow, and Sports) demonstrate that ReAttSPP-net achieves promising results and higher recognition performance than AttSPP-net.
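The idea of attending to a sequence of fixations, rather than a single one, can be illustrated with a small NumPy sketch. This is not the authors' ReAttSPP-net: the CNN backbone and spatial pyramid pooling are omitted, the additive attention scoring and the plain tanh recurrence are assumptions chosen for brevity, and every name (`recurrent_soft_attention`, `W_f`, `W_h`, `v`, `W_in`, `W_rec`) is hypothetical. Weights are random, so the output only demonstrates the data flow.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def recurrent_soft_attention(features, n_steps, p):
    """Run n_steps soft-attention fixations over a feature map.

    features: (L, D) array, L spatial locations with D channels each
              (for example a flattened convolutional feature map).
    p:        dict of weight arrays (hypothetical names, see below).
    Returns the final hidden state (H,) and attention maps (n_steps, L).
    """
    h = np.zeros(p["W_rec"].shape[0])
    attention_maps = []
    for _ in range(n_steps):
        # Assumed additive scoring: each location is scored from its own
        # features and the current recurrent state.
        scores = np.tanh(features @ p["W_f"] + h @ p["W_h"]) @ p["v"]  # (L,)
        alpha = softmax(scores)      # soft attention: weights sum to 1
        glimpse = alpha @ features   # (D,) weighted average of locations
        # The recurrent update combines this fixation with earlier ones.
        h = np.tanh(p["W_in"] @ glimpse + p["W_rec"] @ h)
        attention_maps.append(alpha)
    return h, np.stack(attention_maps)


def make_params(rng, D, H, A):
    """Random illustrative weights; a real model would learn these."""
    return {
        "W_f": rng.normal(size=(D, A)),
        "W_h": rng.normal(size=(H, A)),
        "v": rng.normal(size=A),
        "W_in": rng.normal(size=(H, D)),
        "W_rec": rng.normal(size=(H, H)) * 0.1,
    }


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(49, 8))   # e.g. a 7x7 map with 8 channels
    params = make_params(rng, D=8, H=16, A=12)
    h, maps = recurrent_soft_attention(feats, n_steps=3, p=params)
    print(h.shape, maps.shape)
```

Because the hidden state feeds back into the scoring, each step can place its attention weights differently from the previous one, which is the property the abstract contrasts with single-fixation soft attention. In a full model, the final state would feed a classifier over action labels.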
Pages: 663-673
Page count: 11
    [J]. Communications in Computer and Information Science, 2022, 1568 CCIS : 483 - 493