Recognition of visual activities and interactions by stochastic parsing

Cited by: 370

Authors:

Ivanov, YA

Bobick, AF

Affiliations:

[1] MIT, Media Lab, Vis & Modeling Grp, Cambridge, MA 02139 USA

[2] Georgia Inst Technol, Coll Comp, Atlanta, GA 30332 USA

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2000 / Vol. 22 / No. 08

Keywords:

syntactic pattern recognition; action recognition; high level vision; video surveillance; gesture recognition; video monitoring;

DOI:

10.1109/34.868686

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper describes a probabilistic syntactic approach to the detection and recognition of temporally extended activities and interactions between multiple agents. The fundamental idea is to divide the recognition problem into two levels. The lower level detections are performed using standard independent probabilistic event detectors to propose candidate detections of low-level features. The outputs of these detectors provide the input stream for a stochastic context-free grammar parsing mechanism. The grammar and parser provide longer range temporal constraints, disambiguate uncertain low-level detections, and allow the inclusion of a priori knowledge about the structure of temporal events in a given domain. To achieve such a system we: 1) provide techniques for generating a discrete symbol stream from continuous low-level detectors; 2) extend stochastic context-free parsing to handle uncertainty in the input symbol stream; 3) augment a run-time parsing algorithm to enforce intersymbol constraints such as requiring temporal consistency between primitives; and 4) extend the consistency filtering to maintain consistent multiobject interactions. We develop a real-time system and demonstrate the approach in several experiments on gesture recognition and in video surveillance. In the surveillance application, we show how the system correctly interprets activities of multiple, interacting objects.
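
As a rough illustration of item 2 in the abstract (parsing a symbol stream whose symbols are themselves uncertain), the sketch below scores an input in which each time step carries a likelihood for every candidate terminal. It is not the paper's parser: it uses a plain probabilistic CYK (Viterbi) recursion as a stand-in, and the toy grammar, the rule probabilities, and the function name `viterbi_cyk` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form (not taken from the paper).
# Binary rules: (parent, left, right) -> rule probability
BINARY = {("S", "A", "B"): 1.0}
# Lexical rules: (parent, terminal) -> rule probability
LEXICAL = {("A", "a"): 1.0, ("B", "b"): 1.0}


def viterbi_cyk(observations, start="S"):
    """Probability of the best parse of an uncertain symbol stream.

    observations: list of dicts, one per time step, mapping each candidate
    terminal to the likelihood reported by a low-level detector.
    """
    n = len(observations)
    # best[(i, j, X)] = best probability that nonterminal X spans [i, j)
    best = defaultdict(float)

    # Width-1 spans: weight each lexical rule by the detector likelihood
    # instead of assuming a single certain input symbol.
    for i, dist in enumerate(observations):
        for (parent, term), p_rule in LEXICAL.items():
            score = p_rule * dist.get(term, 0.0)
            if score > best[(i, i + 1, parent)]:
                best[(i, i + 1, parent)] = score

    # Wider spans: combine two adjacent sub-spans with a binary rule.
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (parent, left, right), p_rule in BINARY.items():
                    score = p_rule * best[(i, k, left)] * best[(k, j, right)]
                    if score > best[(i, j, parent)]:
                        best[(i, j, parent)] = score

    return best[(0, n, start)]


# Two time steps; the second detector is unsure between "a" and "b".
stream = [{"a": 0.9, "b": 0.1}, {"a": 0.4, "b": 0.6}]
print(viterbi_cyk(stream))
```

With this toy grammar the only valid reading is "a" followed by "b", so the score is the product of the two matching likelihoods, about 0.54; a stream that can only be read as "b" then "a" scores 0. The grammar thereby resolves the ambiguity in the second detection, which is the role the abstract assigns to the parsing level.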

Pages: 852-872

Page count: 21

50 related records in total

[41] Hierarchical Dynamic Parsing and Encoding for Action Recognition
Su, Bing
Zhou, Jiahuan
Ding, Xiaoqing
Wang, Hao
Wu, Ying
COMPUTER VISION - ECCV 2016, PT IV, 2016, 9908 : 202 - 217
[42] Object recognition and image parsing of natural images
Jeurissen, D. J. J. D. M.
Korjoukov, I.
Kloosterman, N.
Scholte, H. S.
Roelfsema, P. R.
PERCEPTION, 2011, 40 : 94 - 95
[43] Progressively diffused networks for semantic visual parsing
Zhang, Ruimao
Yang, Wei
Peng, Zhanglin
Wei, Pengxu
Wang, Xiaogang
Lin, Liang
PATTERN RECOGNITION, 2019, 90 : 78 - 86
[44] Stochastic Process Underlying Emergent Recognition of Visual Objects Hidden in Degraded Images
Murata, Tsutomu
Hamada, Takashi
Shimokawa, Tetsuya
Tanifuji, Manabu
Yanagida, Toshio
PLOS ONE, 2014, 9 (12):
[45] A visual model approach for parsing colonoscopy videos
Cao, Y
Tavanapong, W
Li, DL
Oh, J
de Groen, PC
Wong, J
IMAGE AND VIDEO RETRIEVAL, PROCEEDINGS, 2004, 3115 : 160 - 169
[46] Visual Parsing After Recovery From Blindness
Ostrovsky, Yuri
Meyers, Ethan
Ganesh, Suma
Mathur, Umang
Sinha, Pawan
PSYCHOLOGICAL SCIENCE, 2009, 20 (12) : 1484 - 1491
[47] Design pattern recovery by visual language parsing
Costagliola, G
De Lucia, A
Deufemia, V
Gravino, C
Risi, M
NINTH EUROPEAN CONFERENCE ON SOFTWARE MAINTENANCE AND REENGINEERING, PROCEEDINGS, 2005, : 102 - 111
[48] On Parsing Visual Sequences with the Hidden Markov Model
Harte, Naomi
Lennon, Daire
Kokaram, Anil
EURASIP JOURNAL ON IMAGE AND VIDEO PROCESSING, 2009,
[49] Practical error handling in parsing visual languages
Tuovinen, AP
JOURNAL OF VISUAL LANGUAGES AND COMPUTING, 2000, 11 (05) : 505 - 528
[50] On Parsing Visual Sequences with the Hidden Markov Model
Naomi Harte
Daire Lennon
Anil Kokaram
EURASIP Journal on Image and Video Processing, 2009