Top-Down Deep Appearance Attention for Action Recognition

Cited: 0
Authors
Anwer, Rao Muhammad [1 ]
Khan, Fahad Shahbaz [2 ]
van de Weijer, Joost [3]
Laaksonen, Jorma [1 ]
Affiliations
[1] Aalto Univ, Sch Sci, Dept Comp Sci, Espoo, Finland
[2] Linkoping Univ, Comp Vis Lab, Linkoping, Sweden
[3] Univ Autonoma Barcelona, Comp Vis Ctr, CS Dept, Barcelona, Spain
Source
IMAGE ANALYSIS, SCIA 2017, PT I | 2017, Vol. 10269
Funding
Academy of Finland
Keywords
Action recognition; CNNs; Feature fusion; FEATURES;
DOI
10.1007/978-3-319-59126-1_25
CLC classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Recognizing human actions in videos is a challenging problem in computer vision. Recently, convolutional neural network based deep features have shown promising results for action recognition. In this paper, we investigate the problem of fusing deep appearance and motion cues for action recognition. We propose a video representation that combines deep appearance and motion based local convolutional features within the bag-of-deep-features framework. First, dense appearance and motion based local convolutional features are extracted from spatial (RGB) and temporal (flow) networks, respectively. Both visual cues are processed in parallel by constructing separate visual vocabularies for appearance and motion. A category-specific appearance map is then learned to modulate the weights of the deep motion features. The proposed representation is discriminative and binds the deep local convolutional features to their spatial locations. Experiments are performed on two challenging datasets: the JHMDB dataset with 21 action classes and the ACT dataset with 43 categories. The results clearly demonstrate that our approach outperforms the standard early and late feature fusion approaches. Furthermore, although our approach employs only action labels and does not exploit body part information, it achieves competitive performance compared to state-of-the-art deep feature based approaches.
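The abstract describes the fusion only at a high level. The sketch below is one plausible reading of the attention-modulated bag-of-deep-features step, not the authors' exact implementation: all names (appearance_attention, attention_weighted_bof, class_weights, motion_vocab) are illustrative assumptions, and hard assignment with L1 normalization stands in for whatever encoding the paper actually uses.

```python
import numpy as np

def appearance_attention(appearance_feats, class_weights, class_bias=0.0):
    """Per-location attention from the RGB stream (assumed form).

    appearance_feats : (H, W, Da) local conv features from the spatial network
    class_weights    : (Da,) hypothetical category-specific linear model
    Returns an (H, W) map in [0, 1]: how discriminative each cell looks for the class.
    """
    scores = appearance_feats @ class_weights + class_bias   # (H, W)
    return 1.0 / (1.0 + np.exp(-scores))                     # sigmoid squashing

def attention_weighted_bof(motion_feats, attention_map, motion_vocab):
    """Bag-of-deep-features over the flow stream, modulated by the appearance map.

    motion_feats  : (H, W, Dm) local conv features from the temporal network
    attention_map : (H, W) weights from appearance_attention
    motion_vocab  : (K, Dm) visual vocabulary (e.g. k-means centroids)
    """
    H, W, Dm = motion_feats.shape
    K = motion_vocab.shape[0]

    feats = motion_feats.reshape(-1, Dm)        # (H*W, Dm) local descriptors
    weights = attention_map.reshape(-1)         # (H*W,) per-location weights

    # Hard-assign each local motion feature to its nearest vocabulary word.
    dists = ((feats[:, None, :] - motion_vocab[None, :, :]) ** 2).sum(-1)
    assign = dists.argmin(axis=1)               # (H*W,)

    # Attention-weighted histogram: spatial cells the appearance map finds
    # class-relevant contribute more to the video representation.
    hist = np.zeros(K)
    np.add.at(hist, assign, weights)
    return hist / max(hist.sum(), 1e-8)         # L1 normalization

if __name__ == "__main__":
    # Toy shapes only, to show how the two steps chain together.
    rng = np.random.default_rng(0)
    app = rng.standard_normal((7, 7, 512))
    mot = rng.standard_normal((7, 7, 512))
    vocab = rng.standard_normal((100, 512))
    att = appearance_attention(app, rng.standard_normal(512))
    video_repr = attention_weighted_bof(mot, att, vocab)     # (100,) histogram
```

Because the attention is applied to per-location descriptors before pooling, the resulting histogram retains a link between motion words and where in the frame they fired, which is the property the abstract refers to as binding local features to their spatial locations.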
Pages: 297-309
Page count: 13