Top-Down Deep Appearance Attention for Action Recognition

Cited: 0
|
Authors
Anwer, Rao Muhammad [1 ]
Khan, Fahad Shahbaz [2 ]
van de Weijer, Joost [3]
Laaksonen, Jorma [1 ]
Affiliations
[1] Aalto Univ, Sch Sci, Dept Comp Sci, Espoo, Finland
[2] Linkoping Univ, Comp Vis Lab, Linkoping, Sweden
[3] Univ Autonoma Barcelona, Comp Vis Ctr, CS Dept, Barcelona, Spain
Source
IMAGE ANALYSIS, SCIA 2017, PT I | 2017年 / 10269卷
Funding
Academy of Finland;
Keywords
Action recognition; CNNs; Feature fusion; Features;
DOI
10.1007/978-3-319-59126-1_25
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recognizing human actions in videos is a challenging problem in computer vision. Recently, convolutional neural network (CNN) based deep features have shown promising results for action recognition. In this paper, we investigate the problem of fusing deep appearance and motion cues for action recognition. We propose a video representation that combines deep appearance and motion based local convolutional features within the bag-of-deep-features framework. First, dense deep appearance and motion based local convolutional features are extracted from spatial (RGB) and temporal (flow) networks, respectively. Both visual cues are processed in parallel by constructing separate visual vocabularies for appearance and motion. A category-specific appearance map is then learned to modulate the weights of the deep motion features. The proposed representation is discriminative and binds the deep local convolutional features to their spatial locations. Experiments are performed on two challenging datasets: the JHMDB dataset with 21 action classes and the ACT dataset with 43 categories. The results clearly demonstrate that our approach outperforms both standard approaches of early and late feature fusion. Furthermore, although our approach employs only action labels and does not exploit body-part information, it achieves performance competitive with state-of-the-art deep-feature-based approaches.
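The core fusion idea described in the abstract — motion descriptors voting into a visual vocabulary, with each vote modulated by an appearance-based attention weight at that spatial location — can be illustrated with a minimal sketch. This is a hypothetical simplification, not the paper's implementation: the function name, hard vocabulary assignment, and the form of the attention weights are all assumptions for illustration.

```python
import numpy as np

def attention_weighted_bow(motion_feats, attention, vocab):
    """Hypothetical sketch: bag-of-deep-features histogram in which each
    local motion descriptor's vote is scaled by a per-location appearance
    attention weight (a stand-in for the paper's learned, category-specific
    appearance map).

    motion_feats: (N, D) local convolutional motion descriptors
    attention:    (N,)   appearance-based weight per spatial location
    vocab:        (K, D) motion visual vocabulary (e.g. k-means centroids)
    """
    # Squared Euclidean distance from every descriptor to every word
    dists = ((motion_feats[:, None, :] - vocab[None, :, :]) ** 2).sum(-1)
    assign = dists.argmin(axis=1)          # hard vocabulary assignment
    hist = np.zeros(len(vocab))
    np.add.at(hist, assign, attention)     # attention-modulated votes
    return hist / max(hist.sum(), 1e-12)   # L1-normalise the histogram
```

With uniform attention this reduces to a standard bag-of-features encoding; non-uniform weights let appearance cues emphasise motion features at action-relevant locations, which is the top-down modulation the title refers to.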
Pages: 297-309
Page count: 13