Scale Coding Bag-of-Words for Action Recognition

Cited by: 6
Authors
Khan, Fahad Shahbaz [1]
van de Weijer, Joost [2]
Bagdanov, Andrew D. [2]
Felsberg, Michael [1]
Affiliations
[1] Linkoping Univ, Comp Vis Lab, S-58183 Linkoping, Sweden
[2] Univ Autonoma Barcelona, CS Dept, Comp Vis Ctr, E-08193 Barcelona, Spain
Keywords
OBJECT; CLASSIFICATION; FEATURES;
DOI
10.1109/ICPR.2014.269
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recognizing human actions in still images is a challenging problem in computer vision due to the significant amount of scale, illumination and pose variation. Given the bounding box of a person at both training and test time, the task is to classify the action associated with each bounding box in an image. Most state-of-the-art methods use the bag-of-words paradigm for action recognition. The bag-of-words framework employing a dense multi-scale grid sampling strategy is the de facto standard for feature detection. This results in a scale-invariant image representation in which the features from all scales are binned into a single histogram. We argue that such a scale-invariant strategy is sub-optimal, since it ignores the multi-scale information available with each bounding box of a person. This paper investigates alternative approaches to scale coding for action recognition in still images. We encode multi-scale information explicitly in three different histograms for small, medium and large scale visual words. Our first approach exploits multi-scale information with respect to the image size. In our second approach, we encode multi-scale information relative to the size of the bounding box of a person instance. In each approach, the multi-scale histograms are then concatenated into a single representation for action classification. We validate our approaches on the Willow dataset, which contains seven action categories: interacting with computer, photography, playing music, riding bike, riding horse, running and walking. Our results clearly suggest that the proposed scale coding approaches outperform the conventional scale-invariant technique. Moreover, we show that our approach obtains promising results compared to more complex state-of-the-art methods.
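The scale coding idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the cutoff values (0.1 and 0.25) and the use of raw counts without normalisation are assumptions made only for illustration. Each feature is assumed to arrive already quantised to a visual-word index together with the scale at which it was sampled. Passing the image size as `reference_size` corresponds to the first approach, and passing the person bounding-box size corresponds to the second.

```python
from typing import List, Sequence, Tuple


def scale_invariant_histogram(word_ids: Sequence[int], vocab_size: int) -> List[int]:
    """Baseline: features from all scales are binned into one histogram."""
    hist = [0] * vocab_size
    for word in word_ids:
        hist[word] += 1
    return hist


def scale_coded_histogram(
    word_ids: Sequence[int],
    feature_scales: Sequence[float],
    vocab_size: int,
    reference_size: float,
    cutoffs: Tuple[float, float] = (0.1, 0.25),  # hypothetical thresholds
) -> List[int]:
    """Concatenate separate small, medium and large scale histograms.

    reference_size is the image size (first approach) or the size of the
    person bounding box (second approach). A feature is assigned to a band
    by comparing feature_scale / reference_size against the cutoffs.
    """
    low, high = cutoffs
    bands = [[0] * vocab_size for _ in range(3)]  # small, medium, large
    for word, scale in zip(word_ids, feature_scales):
        ratio = scale / reference_size
        if ratio < low:
            band = 0
        elif ratio < high:
            band = 1
        else:
            band = 2
        bands[band][word] += 1
    # Final representation: [small | medium | large], length 3 * vocab_size.
    return bands[0] + bands[1] + bands[2]


# Toy input: four features, a vocabulary of three visual words.
words = [0, 1, 1, 2]
scales = [8.0, 24.0, 48.0, 80.0]

baseline = scale_invariant_histogram(words, vocab_size=3)
image_coded = scale_coded_histogram(words, scales, vocab_size=3, reference_size=160.0)
box_coded = scale_coded_histogram(words, scales, vocab_size=3, reference_size=80.0)
```

In this toy input the baseline cannot tell the two occurrences of word 1 apart, while the scale-coded vectors place them in different bands. The same features also fall into different bands depending on whether the image or the bounding box is used as the reference, which is the distinction between the two approaches.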
Pages: 1514-1519
Number of pages: 6