A bag-of-words equivalent recurrent neural network for action recognition

Cited by: 33

Authors:

Richard, Alexander [1]

Gall, Juergen [1]

Affiliations:

[1] Univ Bonn, Romerstrasse 164, D-53177 Bonn, Germany

Source:

COMPUTER VISION AND IMAGE UNDERSTANDING | 2017 / Volume 156

Funding:

European Research Council;

Keywords:

Action recognition; Bag-of-words; Neural networks; DESCRIPTORS; CATEGORIES;

DOI:

10.1016/j.cviu.2016.10.014

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The traditional bag-of-words approach has found a wide range of applications in computer vision. The standard pipeline consists of generating a visual vocabulary, quantizing the features into histograms of visual words, and a classification step, for which a support vector machine in combination with a non-linear kernel is usually used. Given large amounts of data, however, the model suffers from a lack of discriminative power. This applies particularly to action recognition, where the vast amount of video features needs to be subsampled for unsupervised visual vocabulary generation. Moreover, the kernel computation can be very expensive on large datasets. In this work, we propose a recurrent neural network that is equivalent to the traditional bag-of-words approach but enables discriminative training. The model further allows the kernel computation to be incorporated directly into the neural network, solving the complexity issue and allowing the complete classification system to be represented within a single network. We evaluate our method on four recent action recognition benchmarks and show that it outperforms the conventional model as well as sparse coding methods. (C) 2016 Elsevier Inc. All rights reserved.
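The quantization step described in the abstract can be illustrated with a small sketch. It is not the authors' network: it only shows, under simplifying assumptions (hard nearest-word assignment, a fixed random vocabulary, synthetic features), that a bag-of-words histogram computed over all features at once matches one accumulated recurrently, one feature at a time. The names `bow_histogram` and `recurrent_histogram`, and all data, are hypothetical.

```python
import numpy as np

def bow_histogram(features, vocabulary):
    """Batch bag-of-words: assign each feature to its nearest visual word
    and return the normalized histogram of word counts."""
    # Squared Euclidean distances between every feature and every word: (T, K)
    dists = ((features[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=2)
    nearest = dists.argmin(axis=1)
    counts = np.bincount(nearest, minlength=len(vocabulary))
    return counts / len(features)

def recurrent_histogram(features, vocabulary):
    """Same histogram, accumulated step by step in a hidden state h,
    in the style of a recurrent update h_t = h_{t-1} + one_hot(x_t)."""
    h = np.zeros(len(vocabulary))
    for x in features:
        dists = ((x - vocabulary) ** 2).sum(axis=1)
        one_hot = np.zeros(len(vocabulary))
        one_hot[dists.argmin()] = 1.0
        h = h + one_hot  # recurrent accumulation over the sequence
    return h / len(features)

# Synthetic stand-ins (assumptions): 200 video features of dimension 16,
# and a vocabulary of 8 visual words.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 16))
vocabulary = rng.normal(size=(8, 16))

hist_batch = bow_histogram(features, vocabulary)
hist_rec = recurrent_histogram(features, vocabulary)
```

In this sketch the vocabulary is fixed. The abstract's point is that, once the histogram is expressed as a network computation, the vocabulary and classifier can be trained discriminatively; how the paper makes the assignment differentiable is not stated in this record.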


Pages: 79-91

Page count: 13

50 items in total

[1] Scale Coding Bag-of-Words for Action Recognition
Khan, Fahad Shahbaz
van de Weijer, Joost
Bagdanov, Andrew D.
Felsberg, Michael
2014 22ND INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2014, : 1514 - 1519
[2] Recognition of Traffic Sign Based on Bag-of-Words and Artificial Neural Network
Islam, Kh Tohidul
Raj, Ram Gopal
Mujtaba, Ghulam
SYMMETRY-BASEL, 2017, 9 (08):
[3] Bag-of-words Modelling for Speech Recognition
Ziolko, Bartosz
Manandhar, Suresh
Wilson, Richard C.
INTERNATIONAL CONFERENCE ON FUTURE COMPUTER AND COMMUNICATIONS, PROCEEDINGS, 2009, : 646 - +
[4] Bag-of-Words Based Deep Neural Network for Image Retrieval
Bai, Yalong
Yu, Wei
Xiao, Tianjun
Xu, Chang
Yang, Kuiyuan
Ma, Wei-Ying
Zhao, Tiejun
PROCEEDINGS OF THE 2014 ACM CONFERENCE ON MULTIMEDIA (MM'14), 2014, : 229 - 232
[5] Vehicle Logo Recognition Based on Bag-of-Words
Yu, Shuyuan
Zheng, Shibao
Yang, Hua
Liang, Longfei
2013 10TH IEEE INTERNATIONAL CONFERENCE ON ADVANCED VIDEO AND SIGNAL BASED SURVEILLANCE (AVSS 2013), 2013, : 353 - 358
[6] Bag-of-Words as Target for Neural Machine Translation
Ma, Shuming
Sun, Xu
Wang, Yizhong
Lin, Junyang
PROCEEDINGS OF THE 56TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 2, 2018, : 332 - 338
[7] Improved Neural Bag-of-Words Model to Retrieve Out-of-Vocabulary Words in Speech Recognition
Sheikh, Imran
Illina, Irina
Fohr, Dominique
Linares, Georges
17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 675 - 679
[8] Recurrent Neural Network Language Model With Incremental Updated Context Information Generated Using Bag-of-Words Representation
Haidar, Md. Akmal
Kurimo, Mikko
17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 3504 - 3508
[9] Bag-of-Words Input for Long History Representation in Neural Network-based Language Models for Speech Recognition
Irie, Kazuki
Schlueter, Ralf
Ney, Hermann
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 2371 - 2375
[10] Augmenting bag-of-words: a robust contextual representation of spatiotemporal interest points for action recognition
Yang Li
Junyong Ye
Tongqing Wang
Shijian Huang
The Visual Computer, 2015, 31 : 1383 - 1394
