Scale Coding Bag-of-Words for Action Recognition

Cited by: 6

Authors:

Khan, Fahad Shahbaz [1]

van de Weijer, Joost [2]

Bagdanov, Andrew D. [2]

Felsberg, Michael [1]

Affiliations:

[1] Linkoping Univ, Comp Vis Lab, S-58183 Linkoping, Sweden

[2] Univ Autonoma Barcelona, CS Dept, Comp Vis Ctr, E-08193 Barcelona, Spain

Source:

2014 22ND INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR) | 2014

Keywords:

OBJECT; CLASSIFICATION; FEATURES;

DOI:

10.1109/ICPR.2014.269

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Recognizing human actions in still images is a challenging problem in computer vision due to the significant amount of scale, illumination and pose variation. Given the bounding box of a person both at training and test time, the task is to classify the action associated with each bounding box in an image. Most state-of-the-art methods use the bag-of-words paradigm for action recognition. The bag-of-words framework employing a dense multi-scale grid sampling strategy is the de facto standard for feature detection. This results in a scale-invariant image representation where all the features at multiple scales are binned in a single histogram. We argue that such a scale-invariant strategy is sub-optimal since it ignores the multi-scale information available with each bounding box of a person. This paper investigates alternative approaches to scale coding for action recognition in still images. We encode multi-scale information explicitly in three different histograms for small-, medium- and large-scale visual words. Our first approach exploits multi-scale information with respect to the image size. In our second approach, we encode multi-scale information relative to the size of the bounding box of a person instance. In each approach, the multi-scale histograms are then concatenated into a single representation for action classification. We validate our approaches on the Willow dataset, which contains seven action categories: interacting with computer, photography, playing music, riding bike, riding horse, running and walking. Our results clearly suggest that the proposed scale coding approaches outperform the conventional scale-invariant technique. Moreover, we show that our approach obtains promising results compared to more complex state-of-the-art methods.
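
The contrast drawn in the abstract between a single scale-invariant histogram and three concatenated scale-specific histograms can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `thresholds` cut-offs, and the `scale_factor` parameter are hypothetical, and the abstract does not state how scales are partitioned or normalized. Here `scale_factor=1.0` stands in for binning by scale as measured in the image, while a factor derived from the bounding-box size stands in for the relative variant.

```python
from typing import List, Sequence, Tuple

Feature = Tuple[int, float]  # (visual-word index, patch scale)


def scale_invariant_histogram(features: Sequence[Feature], vocab_size: int) -> List[int]:
    """Conventional bag-of-words: every scale is binned into one histogram."""
    hist = [0] * vocab_size
    for word, _scale in features:
        hist[word] += 1
    return hist


def scale_coded_histogram(
    features: Sequence[Feature],
    vocab_size: int,
    thresholds: Tuple[float, float],
    scale_factor: float = 1.0,
) -> List[int]:
    """Concatenate separate histograms for small, medium and large scales.

    thresholds: hypothetical (small/medium, medium/large) cut-offs.
    scale_factor: multiplies each patch scale before binning; an assumed way
        to express scale relative to the bounding-box size.
    """
    t_small, t_large = thresholds
    hists = [[0] * vocab_size for _ in range(3)]
    for word, scale in features:
        s = scale * scale_factor
        if s < t_small:
            group = 0  # small-scale visual words
        elif s < t_large:
            group = 1  # medium-scale visual words
        else:
            group = 2  # large-scale visual words
        hists[group][word] += 1
    # The final representation is the concatenation of the three histograms.
    return hists[0] + hists[1] + hists[2]


if __name__ == "__main__":
    feats = [(0, 4.0), (1, 4.0), (0, 8.0), (2, 16.0)]
    print(scale_invariant_histogram(feats, 3))
    print(scale_coded_histogram(feats, 3, (6.0, 12.0)))
    print(scale_coded_histogram(feats, 3, (6.0, 12.0), scale_factor=0.5))
```

With a vocabulary of K words the scale-coded representation has 3K dimensions, so the same visual word observed at different scales contributes to different bins instead of being merged.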

Pages: 1514-1519

Page count: 6

50 records in total

[1] A bag-of-words equivalent recurrent neural network for action recognition
Richard, Alexander
Gall, Juergen
[J]. COMPUTER VISION AND IMAGE UNDERSTANDING, 2017, 156 : 79 - 91
[2] Bag-of-words Modelling for Speech Recognition
Ziolko, Bartosz
Manandhar, Suresh
Wilson, Richard C.
[J]. INTERNATIONAL CONFERENCE ON FUTURE COMPUTER AND COMMUNICATIONS, PROCEEDINGS, 2009 : 646 - +
[3] Vehicle Logo Recognition Based on Bag-of-Words
Yu, Shuyuan
Zheng, Shibao
Yang, Hua
Liang, Longfei
[J]. 2013 10TH IEEE INTERNATIONAL CONFERENCE ON ADVANCED VIDEO AND SIGNAL BASED SURVEILLANCE (AVSS 2013), 2013 : 353 - 358
[4] Augmenting bag-of-words: a robust contextual representation of spatiotemporal interest points for action recognition
Li, Yang
Ye, Junyong
Wang, Tongqing
Huang, Shijian
[J]. VISUAL COMPUTER, 2015, 31 (10) : 1383 - 1394
[5] Augmenting bag-of-words: a robust contextual representation of spatiotemporal interest points for action recognition
Yang Li
Junyong Ye
Tongqing Wang
Shijian Huang
[J]. The Visual Computer, 2015, 31 : 1383 - 1394
[6] Sequential Bag-of-Words model for human action classification
Liu, Hong
Tang, Hao
Xiao, Wei
Guo, ZiYi
Tian, Lu
Gao, Yuan
[J]. CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY, 2016, 1 (02) : 125 - 136
[7] Spatio-Temporal Scale Coded Bag-of-Words
Govender, Divina
Tapamo, Jules-Raymond
[J]. SENSORS, 2020, 20 (21) : 1 - 25
[8] Exploiting Visual Saliency and Bag-of-Words for Road Sign Recognition
Xu, Dan
Xu, Wei
Tang, Zhenmin
Liu, Fan
[J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2014, E97D (09) : 2473 - 2482
[9] Research on motion recognition algorithm based on bag-of-words model
Huang, Ting
Ru, Sheng-Rong
Zeng, Zhi-Hong
Zhang, Long
[J]. MICROSYSTEM TECHNOLOGIES-MICRO-AND NANOSYSTEMS-INFORMATION STORAGE AND PROCESSING SYSTEMS, 2021, 27 (04) : 1647 - 1654
[10] Improved Action Recognition by Combining Multiple 2D Views in the Bag-of-Words Model
Burghouts, Gertjan
Eendebak, Pieter
Bouma, Henri
ten Hove, Johan-Martijn
[J]. 2013 10TH IEEE INTERNATIONAL CONFERENCE ON ADVANCED VIDEO AND SIGNAL BASED SURVEILLANCE (AVSS 2013), 2013 : 250 - 255