SUM-MAX VIDEO POOLING FOR COMPLEX EVENT RECOGNITION

Cited by: 0

Authors:

Phan, Sang [1,2]

Duy-Dinh Le [2,3]

Satoh, Shin'ichi [2]

Affiliations:

[1] Grad Univ Adv Studies SOKENDAI, Hayama, Japan

[2] Natl Inst Informat, Tokyo, Japan

[3] Univ Informat Technol, Multimedia Commun Lab, Ho Chi Minh, Vietnam

Source:

2014 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP) | 2014

Keywords:

video representation; sum-pooling; max-pooling; sum-max video pooling; multimedia event detection; FEATURES;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Subject classification code:

081202;

Abstract:

A video can be viewed as a layered structure in which the lowest layer consists of frames, the top layer is the entire video, and the middle layers are sequences of consecutive frames or concatenations of lower layers. While it is easy to find local discriminative features in the lower layers of a video, it is non-trivial to aggregate these features into a discriminative video representation. In the literature, sum pooling is often used and obtains reasonable recognition performance on artificial videos. However, sum pooling does not work well on complex videos because the regions of interest may reside within some of the middle layers. In this paper, we leverage the layered structure of video to propose a new pooling method, named sum-max video pooling, to handle this problem. Specifically, we apply sum pooling at the low-layer representation and max pooling at the high-layer representation. Sum pooling keeps sufficient relevant features at the low layer, while max pooling retrieves the most relevant features at the high layer and can therefore discard irrelevant features in the final video representation. Experimental results on the TRECVID Multimedia Event Detection 2010 dataset show the effectiveness of our method.
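
The two-stage pooling described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sum_max_pool`, the fixed `segment_length`, and the use of non-overlapping segments as the "low layer" are all assumptions made for illustration, since the abstract does not specify how layers are constructed.

```python
def sum_max_pool(frame_features, segment_length):
    """Pool per-frame feature vectors into one video-level vector.

    Hypothetical sketch: frames are grouped into non-overlapping segments
    of `segment_length` consecutive frames (an assumed layer definition).
    """
    if segment_length < 1:
        raise ValueError("segment_length must be at least 1")
    if not frame_features:
        raise ValueError("frame_features must not be empty")

    # Low layer: sum pooling over the frames inside each segment.
    segment_sums = []
    for start in range(0, len(frame_features), segment_length):
        segment = frame_features[start:start + segment_length]
        segment_sums.append([sum(column) for column in zip(*segment)])

    # High layer: max pooling across segments, one maximum per dimension.
    return [max(column) for column in zip(*segment_sums)]


# Four frames with 2-dimensional features, pooled in segments of two frames.
frames = [[1, 0], [2, 0], [0, 3], [1, 1]]
video_vector = sum_max_pool(frames, segment_length=2)
print(video_vector)
```

In this toy input the segment sums are `[3, 0]` and `[1, 4]`, so the video-level vector takes the larger value per dimension, whereas plain sum pooling over all frames would mix both segments together.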


Pages: 1026-1030

Page count: 5
