Recognition of visual activities and interactions by stochastic parsing

Cited by: 370
Authors
Ivanov, YA
Bobick, AF
Affiliations
[1] MIT, Media Lab, Vis & Modeling Grp, Cambridge, MA 02139 USA
[2] Georgia Inst Technol, Coll Comp, Atlanta, GA 30332 USA
Keywords
syntactic pattern recognition; action recognition; high level vision; video surveillance; gesture recognition; video monitoring
DOI
10.1109/34.868686
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper describes a probabilistic syntactic approach to the detection and recognition of temporally extended activities and interactions between multiple agents. The fundamental idea is to divide the recognition problem into two levels. The lower-level detections are performed using standard independent probabilistic event detectors to propose candidate detections of low-level features. The outputs of these detectors provide the input stream for a stochastic context-free grammar parsing mechanism. The grammar and parser provide longer-range temporal constraints, disambiguate uncertain low-level detections, and allow the inclusion of a priori knowledge about the structure of temporal events in a given domain. To achieve such a system, we: 1) provide techniques for generating a discrete symbol stream from continuous low-level detectors; 2) extend stochastic context-free parsing to handle uncertainty in the input symbol stream; 3) augment a run-time parsing algorithm to enforce intersymbol constraints such as requiring temporal consistency between primitives; and 4) extend the consistency filtering to maintain consistent multiobject interactions. We develop a real-time system and demonstrate the approach in several experiments on gesture recognition and in video surveillance. In the surveillance application, we show how the system correctly interprets activities of multiple, interacting objects.
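The two-level idea in the abstract can be illustrated with a small sketch. It is not the paper's parser: it substitutes a plain Viterbi CYK parse over a toy grammar in Chomsky normal form, and the grammar, the symbol names (`UP`, `DOWN`, `S`), and the detector likelihoods are all hypothetical. It only shows how a stochastic grammar can override per-frame detector votes when the input symbols are uncertain.

```python
from collections import defaultdict


def viterbi_cyk(observations, lexical, binary, start="S"):
    """Best parse of an uncertain symbol stream under a toy stochastic CFG.

    observations: one dict per time step, mapping terminal -> detector likelihood.
    lexical:      {(parent, terminal): rule probability}
    binary:       {(parent, left, right): rule probability}
    Returns (probability, chosen terminal sequence), or (0.0, None) if no parse.
    """
    n = len(observations)
    # chart[(i, j)][symbol] = (best probability, terminals) for the span i..j
    chart = defaultdict(dict)

    # Lower level: weight each lexical rule by the detector likelihood.
    for i, likelihoods in enumerate(observations):
        for (parent, terminal), rule_p in lexical.items():
            p = rule_p * likelihoods.get(terminal, 0.0)
            if p > chart[(i, i + 1)].get(parent, (0.0, None))[0]:
                chart[(i, i + 1)][parent] = (p, [terminal])

    # Upper level: combine adjacent spans with the binary grammar rules.
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (parent, left, right), rule_p in binary.items():
                    if left in chart[(i, k)] and right in chart[(k, j)]:
                        left_p, left_terms = chart[(i, k)][left]
                        right_p, right_terms = chart[(k, j)][right]
                        p = rule_p * left_p * right_p
                        if p > chart[(i, j)].get(parent, (0.0, None))[0]:
                            chart[(i, j)][parent] = (p, left_terms + right_terms)

    return chart[(0, n)].get(start, (0.0, None))


# Hypothetical gesture grammar: a gesture is one "up" followed by one "down".
lexical = {("UP", "up"): 1.0, ("DOWN", "down"): 1.0}
binary = {("S", "UP", "DOWN"): 1.0}

# Hypothetical detector outputs: taken alone, both steps favour "up".
observations = [
    {"up": 0.6, "down": 0.4},
    {"up": 0.55, "down": 0.45},
]

prob, seq = viterbi_cyk(observations, lexical, binary)
print(seq, prob)
```

Taking the most likely symbol at each step independently would yield "up, up", which the toy grammar cannot parse; the parse instead selects "up, down", which is the kind of disambiguation of uncertain low-level detections that the abstract describes.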
Pages: 852-872
Page count: 21