Recognition of visual activities and interactions by stochastic parsing

Cited by: 370
Authors
Ivanov, YA
Bobick, AF
Affiliations
[1] MIT, Media Lab, Vis & Modeling Grp, Cambridge, MA 02139 USA
[2] Georgia Inst Technol, Coll Comp, Atlanta, GA 30332 USA
Keywords
syntactic pattern recognition; action recognition; high level vision; video surveillance; gesture recognition; video monitoring;
DOI
10.1109/34.868686
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper describes a probabilistic syntactic approach to the detection and recognition of temporally extended activities and interactions between multiple agents. The fundamental idea is to divide the recognition problem into two levels. The lower level detections are performed using standard independent probabilistic event detectors to propose candidate detections of low-level features. The outputs of these detectors provide the input stream for a stochastic context-free grammar parsing mechanism. The grammar and parser provide longer range temporal constraints, disambiguate uncertain low-level detections, and allow the inclusion of a priori knowledge about the structure of temporal events in a given domain. To achieve such a system we: 1) provide techniques for generating a discrete symbol stream from continuous low-level detectors; 2) extend stochastic context-free parsing to handle uncertainty in the input symbol stream; 3) augment a run-time parsing algorithm to enforce intersymbol constraints such as requiring temporal consistency between primitives; and 4) extend the consistency filtering to maintain consistent multiobject interactions. We develop a real-time system and demonstrate the approach in several experiments on gesture recognition and in video surveillance. In the surveillance application, we show how the system correctly interprets activities of multiple, interacting objects.
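As an illustration of step 2 in the abstract (parsing an uncertain symbol stream with a stochastic context-free grammar), the following is a minimal sketch and not the authors' implementation. It swaps in a Viterbi-style CYK parser over a grammar in Chomsky normal form, since the abstract does not state which parsing algorithm is used. The function name `viterbi_cyk`, the toy grammar, and the detector likelihoods are all hypothetical. Each time step carries a dictionary of candidate terminals with likelihoods instead of a single hard symbol.

```python
from collections import defaultdict


def viterbi_cyk(stream, lexical, binary, start="S"):
    """Return the probability of the best parse of an uncertain symbol stream.

    stream  : list of dicts, one per time step, terminal -> detector likelihood
    lexical : dict (nonterminal, terminal) -> rule probability
    binary  : dict (parent, left, right) -> rule probability
    All names and the CNF restriction are assumptions made for this sketch.
    """
    n = len(stream)
    # best[i][j][A] = best probability that nonterminal A derives stream[i:j]
    best = [[defaultdict(float) for _ in range(n + 1)] for _ in range(n + 1)]

    # Spans of length 1: weight each lexical rule by the detector likelihood.
    for i, candidates in enumerate(stream):
        for (nt, term), p_rule in lexical.items():
            p = p_rule * candidates.get(term, 0.0)
            if p > best[i][i + 1][nt]:
                best[i][i + 1][nt] = p

    # Longer spans: combine two adjacent sub-spans with a binary rule.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (parent, left, right), p_rule in binary.items():
                    p = (p_rule
                         * best[i][k].get(left, 0.0)
                         * best[k][j].get(right, 0.0))
                    if p > best[i][j][parent]:
                        best[i][j][parent] = p

    return best[0][n].get(start, 0.0)


# Toy grammar (hypothetical): S -> A B, A -> 'a', B -> 'b'
lexical = {("A", "a"): 1.0, ("B", "b"): 1.0}
binary = {("S", "A", "B"): 1.0}

# Two time steps; each detector output is a distribution over terminals.
stream = [{"a": 0.9, "b": 0.1}, {"a": 0.4, "b": 0.6}]
score = viterbi_cyk(stream, lexical, binary)  # approximately 0.9 * 0.6
```

In this toy case the grammar accepts only the sequence `a b`, so the parse score is the product of the likelihoods the detectors assigned to `a` at the first step and `b` at the second. A stream the grammar cannot derive, such as a single time step, scores 0.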
Pages: 852-872
Number of pages: 21
Related papers
50 in total
  • [1] Stochastic parsing and parallelism
    Barcala, FM
    Sacristán, O
    Graña, J
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, 2001, 2004 : 401 - 410
  • [2] INHIBITORY INTERACTIONS IN VISUAL RECOGNITION OF IDENTITY
    BJORK, EL
    BULLETIN OF THE PSYCHONOMIC SOCIETY, 1977, 10 (04) : 244 - 244
  • [3] Video events recognition by improved stochastic parsing based on extended stochastic context-free grammar representation
    曹茂永
    赵猛
    裴明涛
    赵增顺
    Journal of Beijing Institute of Technology, 2013, 22 (01) : 81 - 88
  • [4] Modelling of interactions for the recognition of activities in groups of people
    Stephens, Kyle
    Bors, Adrian G.
    DIGITAL SIGNAL PROCESSING, 2018, 79 : 34 - 46
  • [5] The Evolution of Stochastic Grammars for Representation and Recognition of Activities in Videos
    Chellappa, Rama
    2011 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCV WORKSHOPS), 2011,
  • [6] VISUAL WORD RECOGNITION - INTERACTIONS WITH THE SENSORY SURFACE
    BESNER, D
    BULLETIN OF THE PSYCHONOMIC SOCIETY, 1986, 24 (05) : 338 - 338
  • [7] STOCHASTIC PARSING AND EVOLUTIONARY ALGORITHMS
    Araujo, Lourdes
    APPLIED ARTIFICIAL INTELLIGENCE, 2009, 23 (04) : 346 - 372
  • [8] View dependencies in the visual recognition of social interactions
    de La Rosa, S.
    Miekes, S.
    Buelthoff, H.
    Curio, C.
    PERCEPTION, 2012, 41 : 240 - 240
  • [9] View dependencies in the visual recognition of social interactions
    de la Rosa, Stephan
    Mieskes, Sarah
    Buelthoff, Heinrich H.
    Curio, Cristobal
    FRONTIERS IN PSYCHOLOGY, 2013, 4
  • [10] Petroglyph Recognition using Self-Organizing Maps and Fuzzy Visual Language Parsing
    Deufemia, Vincenzo
    Paolino, Luca
    de Lumley, Henry
    2012 IEEE 24TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2012), VOL 1, 2012, : 852 - 859