Spatially Coherent Interpretations of Videos Using Pattern Theory

Cited by: 0
Authors
Fillipe D. M. de Souza
Sudeep Sarkar
Anuj Srivastava
Jingyong Su
Affiliations
[1] University of South Florida,Department of Computer Science & Engineering
[2] Florida State University,Department of Statistics
[3] Texas Tech University,Department of Mathematics & Statistics
Source
International Journal of Computer Vision, 2017, 121 (01)
Keywords
Activity detection; Pattern theory; Graphical methods; Compositional approach
DOI
Not available
Abstract
Activity interpretation in videos results not only in recognition or labeling of dominant activities, but also in semantic descriptions of scenes. Toward this broader goal, we present a combinatorial approach that assumes the availability of algorithms for detecting and labeling objects and basic actions in videos, albeit with some errors. Given these uncertain labels and detected objects, we link them into interpretable structures using domain knowledge, under the framework of Grenander's general pattern theory. Here a semantic description is built from basic units, termed generators, that represent either objects or actions. These generators have multiple out-bonds, each associated with different types of domain semantics, spatial constraints, and image evidence. The generators combine, according to a set of predefined combination rules that capture domain semantics, to form larger configurations that represent video interpretations. The framework derives its representational power from flexibility in the size and structure of configurations. We impose a probability distribution on the configuration space, with inferences generated using a Markov chain Monte Carlo-based simulated annealing process. The primary advantage of the approach is that it handles known challenges (appearance variability, errors in object labels, object clutter, simultaneous events, etc.) without the need for exponentially large (labeled) training data. Experimental results demonstrate its ability to provide interpretations under clutter and simultaneous events. They show: (1) a performance increase of more than 30% over other state-of-the-art approaches using more than 5000 video units from the Breakfast Actions dataset, and (2) an overall recall and precision improvement of more than 50% and 100%, respectively, on the YouCook dataset.
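The inference step described in the abstract (a probability distribution over configurations, explored by MCMC-based simulated annealing) can be illustrated with a toy sketch. This is not the authors' implementation: the labels, detector confidences, bond energies, cooling schedule, and the names `CANDIDATES`, `BOND_ENERGY`, `energy`, and `anneal` are all hypothetical, and the configuration structure is fixed to one object and one action, whereas the paper's configurations vary in size and structure.

```python
import math
import random

# Hypothetical detector output: candidate labels with confidences per generator slot.
CANDIDATES = {
    "object_1": [("bowl", 0.6), ("cup", 0.4)],
    "action_1": [("stir", 0.5), ("pour", 0.5)],
}

# Hypothetical domain knowledge: energy of an (action, object) bond.
# Lower energy means the pairing is more compatible; unlisted pairs get 0.
BOND_ENERGY = {("stir", "bowl"): -1.0, ("pour", "cup"): -1.0}


def energy(config):
    """Total energy: image-evidence term plus semantic bond term."""
    total = 0.0
    for slot, label in config.items():
        confidence = dict(CANDIDATES[slot])[label]
        total += -math.log(confidence)  # weak evidence costs more energy
    total += BOND_ENERGY.get((config["action_1"], config["object_1"]), 0.0)
    return total


def anneal(steps=2000, t_start=2.0, t_end=0.05, seed=0):
    """Metropolis sampling with a geometric cooling schedule (assumed)."""
    rng = random.Random(seed)
    config = {slot: rng.choice(opts)[0] for slot, opts in CANDIDATES.items()}
    best = dict(config)
    for i in range(steps):
        # Temperature decays geometrically from t_start to t_end.
        t = t_start * (t_end / t_start) ** (i / (steps - 1))
        # Symmetric proposal: relabel one randomly chosen generator.
        slot = rng.choice(list(CANDIDATES))
        proposal = dict(config)
        proposal[slot] = rng.choice(CANDIDATES[slot])[0]
        delta = energy(proposal) - energy(config)
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            config = proposal
        if energy(config) < energy(best):
            best = dict(config)
    return best


if __name__ == "__main__":
    print(anneal())
```

With these made-up numbers, the lowest-energy configuration pairs `stir` with `bowl`, because the bond term rewards that pairing and `bowl` carries the stronger evidence. A fuller sketch would also propose adding and removing generators and bonds, which is where the flexibility in configuration size mentioned in the abstract would come in.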
Pages: 5-25
Page count: 20
Related papers
50 in total
  • [1] Spatially Coherent Interpretations of Videos Using Pattern Theory
    de Souza, Fillipe D. M.
    Sarkar, Sudeep
    Srivastava, Anuj
    Su, Jingyong
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2017, 121 (01) : 5 - 25
  • [2] Temporally Coherent Interpretations for Long Videos Using Pattern Theory
    Souza, Fillipe
    Sarkar, Sudeep
    Srivastava, Anuj
    Su, Jingyong
    2015 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2015, : 1229 - 1237
  • [3] Theory of spatially and spectrally partially coherent pulses
    Lajunen, H
    Vahimaa, P
    Tervo, J
    JOURNAL OF THE OPTICAL SOCIETY OF AMERICA A-OPTICS IMAGE SCIENCE AND VISION, 2005, 22 (08) : 1536 - 1545
  • [4] Learning continuous temporal embedding of videos using pattern theory
    Xie, Zhao
    Wu, Kewei
    Zhang, Xiaoyu
    Yang, Xingming
    Hou, Jinkui
    PATTERN RECOGNITION LETTERS, 2021, 146 : 222 - 229
  • [5] Common Visual Pattern Discovery via Spatially Coherent Correspondences
    Liu, Hairong
    Yan, Shuicheng
    2010 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2010, : 1609 - 1616
  • [6] THE THEORY OF THE FLYING GEESE PATTERN OF DEVELOPMENT AND ITS INTERPRETATIONS
    KORHONEN, P
    JOURNAL OF PEACE RESEARCH, 1994, 31 (01) : 93 - 108
  • [7] Pattern theory for representation and inference of semantic structures in videos
    de Souza, Fillipe D. M.
    Sarkar, Sudeep
    Srivastava, Anuj
    Su, Jingyong
    PATTERN RECOGNITION LETTERS, 2016, 72 : 41 - 51
  • [8] Spatially coherent clustering using graph cuts
    Zabih, R
    Kolmogorov, V
    PROCEEDINGS OF THE 2004 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOL 2, 2004, : 437 - 444
  • [9] A coherent theory for the copyright protection of computer software and recent judicial interpretations
    Karjala, DS
    UNIVERSITY OF CINCINNATI LAW REVIEW, 1997, 66 (01) : 53 - 117
  • [10] PREADOLESCENT PERCEPTIONS AND INTERPRETATIONS OF MUSIC VIDEOS
    CHRISTENSON, P
    POPULAR MUSIC AND SOCIETY, 1992, 16 (03) : 63 - 73