A factor graph framework for semantic video indexing

Authors
Naphade, MR [1]
Kozintsev, IV
Huang, TS
Affiliations
[1] IBM Corp, Thomas J Watson Res Ctr, Hawthorne, NY 10532 USA
[2] Univ Illinois, Dept Elect & Comp Engn, Beckman Inst Adv Sci & Technol, Urbana, IL 61801 USA
Keywords
factor graphs; hidden Markov models; likelihood ratio test; multimedia understanding; probabilistic graphical networks; probability propagation; query by example; query by keywords; ROC curves; semantic video indexing; sum-product algorithm
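DOI
Not available
Chinese Library Classification number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification code
0808; 0809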
Abstract
Video query by semantic keywords is one of the most challenging research issues in video data management. To go beyond low-level similarity and access video content by semantics, we need to bridge the gap between the low-level representation and high-level semantics. This is a difficult multimedia understanding problem. We formulate it as a probabilistic pattern-recognition problem in which semantics are modeled in terms of concepts and context. To map low-level features to high-level semantics, we propose probabilistic multimedia objects (multijects). Examples of multijects in movies include explosion, mountain, beach, outdoor, music, etc. Semantic concepts in videos interact and appear in context. To model this interaction explicitly, we propose a network of multijects (multinet). To model the multinet computationally, we propose a factor graph framework that can enforce spatio-temporal constraints. Using probabilistic models for the multijects rocks, sky, snow, water-body, and forestry/greenery, and using a factor graph as the multinet, we demonstrate the application of this framework to semantic video indexing. We demonstrate how detection performance can be significantly improved by using the multinet to take inter-conceptual relationships into account. Our experiments, which use a large video database consisting of clips from several movies and a set of five semantic concepts, show an improvement in detection performance of over 22%. We also show how the multinet can be extended to take temporal correlation into account. By constructing a dynamic multinet, we show that detection performance is further enhanced by as much as 12%. With this framework, we show how keyword-based query and semantic filtering are possible for a predetermined set of concepts.
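The abstract does not give the structure of the factor graph, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes three binary concept variables, one local evidence factor per concept (standing in for a multiject detector output), and a single global factor that encodes co-occurrence. All names and numbers (`local`, `cooccurrence`, `sum_product_marginals`, the weights) are hypothetical and chosen for illustration. Because the graph is a tree, one pass of sum-product messages gives exact marginals.

```python
from itertools import product

def sum_product_marginals(local, joint):
    """Sum-product on a star-shaped factor graph.

    local: list of (weight_absent, weight_present) per concept,
           standing in for per-concept detector evidence (assumed).
    joint: function mapping a tuple of 0/1 labels to a non-negative
           weight, standing in for the global context factor (assumed).
    Returns a list of (P(absent), P(present)) per concept.
    """
    n = len(local)
    beliefs = []
    for i in range(n):
        # Message from the global factor to variable i: sum the factor
        # over all other variables, weighted by their incoming messages.
        msg = [0.0, 0.0]
        for x in product((0, 1), repeat=n):
            w = joint(x)
            for j in range(n):
                if j != i:
                    w *= local[j][x[j]]
            msg[x[i]] += w
        # Belief = local evidence times incoming message, normalized.
        b0 = local[i][0] * msg[0]
        b1 = local[i][1] * msg[1]
        z = b0 + b1
        beliefs.append((b0 / z, b1 / z))
    return beliefs

# Hypothetical evidence for (sky, snow, water-body).
local = [(0.1, 0.9), (0.5, 0.5), (0.6, 0.4)]

def cooccurrence(x):
    """Hypothetical context: snow is likelier with sky than without."""
    sky, snow, water = x
    w = 1.0
    if snow and sky:
        w *= 3.0
    if snow and not sky:
        w *= 0.2
    return w

beliefs = sum_product_marginals(local, cooccurrence)
for name, (_, p) in zip(["sky", "snow", "water-body"], beliefs):
    print(f"{name}: P(present) = {p:.3f}")
```

In this toy setting the snow detector alone is uninformative (0.5 versus 0.5), yet strong evidence for sky raises the belief in snow through the context factor. This is the kind of inter-conceptual effect the abstract attributes to the multinet.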
Pages: 40-52
Page count: 13