A factor graph framework for semantic video indexing

Cited by: 0
Authors
Naphade, MR [1]
Kozintsev, IV
Huang, TS
Affiliations
[1] IBM Corp, Thomas J Watson Res Ctr, Hawthorne, NY 10532 USA
[2] Univ Illinois, Dept Elect & Comp Engn, Beckman Inst Adv Sci & Technol, Urbana, IL 61801 USA
Keywords
factor graphs; hidden Markov models; likelihood ratio test; multimedia understanding; probabilistic graphical networks; probability propagation; query by example; query by keywords; ROC curves; semantic video indexing; sum-product algorithm
DOI
Not available
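Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809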
Abstract
Video query by semantic keywords is one of the most challenging research issues in video data management. To go beyond low-level similarity and access video content by its semantics, we need to bridge the gap between the low-level representation and high-level semantics. This is a difficult multimedia understanding problem. We formulate it as a probabilistic pattern-recognition problem in which semantics are modeled in terms of concepts and context. To map low-level features to high-level semantics, we propose probabilistic multimedia objects (multijects). Examples of multijects in movies include explosion, mountain, beach, outdoor, music, etc. Semantic concepts in videos interact and appear in context. To model this interaction explicitly, we propose a network of multijects (multinet). To model the multinet computationally, we propose a factor graph framework that can enforce spatio-temporal constraints. Using probabilistic models for the multijects rocks, sky, snow, water-body, and forestry/greenery, and using a factor graph as the multinet, we demonstrate the application of this framework to semantic video indexing. We demonstrate how detection performance can be significantly improved by using the multinet to take inter-conceptual relationships into account. Our experiments, which use a large video database consisting of clips from several movies and a set of five semantic concepts, show that detection performance improves by over 22%. We also show how the multinet is extended to take temporal correlation into account. By constructing a dynamic multinet, we show that detection performance is further enhanced by as much as 12%. With this framework, we show how keyword-based query and semantic filtering are possible for a predetermined set of concepts.
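To illustrate the idea of fusing per-concept detector outputs through a factor graph with the sum-product algorithm, here is a minimal sketch. It is not the authors' implementation: it assumes just two binary concepts (sky and snow, taken from the abstract) linked by a single hypothetical pairwise co-occurrence factor, and all numeric values, function names, and the `compat` table are invented for illustration. Because this tiny graph is a tree, one round of message passing gives exact marginals, which the sketch checks against brute-force enumeration.

```python
from itertools import product


def normalize(p):
    """Scale a list of non-negative numbers so that it sums to 1."""
    total = sum(p)
    return [x / total for x in p]


def fuse_two_concepts(local_a, local_b, compat):
    """Sum-product on the tree  A -- f(A, B) -- B  with binary A and B.

    local_a, local_b: [evidence for absent, evidence for present], standing in
        for the outputs of independent per-concept detectors (hypothetical).
    compat[a][b]: hypothetical co-occurrence factor between the two concepts.
    Returns the posterior marginals of A and B.
    """
    # Message from the factor to A: sum out B, weighted by B's local evidence.
    msg_to_a = [sum(compat[a][b] * local_b[b] for b in (0, 1)) for a in (0, 1)]
    # Message from the factor to B: sum out A, weighted by A's local evidence.
    msg_to_b = [sum(compat[a][b] * local_a[a] for a in (0, 1)) for b in (0, 1)]
    # Each marginal is local evidence times the incoming message, normalized.
    post_a = normalize([local_a[a] * msg_to_a[a] for a in (0, 1)])
    post_b = normalize([local_b[b] * msg_to_b[b] for b in (0, 1)])
    return post_a, post_b


def brute_force(local_a, local_b, compat):
    """Reference marginals computed by enumerating the full joint."""
    joint = {
        (a, b): local_a[a] * local_b[b] * compat[a][b]
        for a, b in product((0, 1), repeat=2)
    }
    z = sum(joint.values())
    post_a = [sum(v for (a, _), v in joint.items() if a == i) / z for i in (0, 1)]
    post_b = [sum(v for (_, b), v in joint.items() if b == i) / z for i in (0, 1)]
    return post_a, post_b


if __name__ == "__main__":
    sky = [0.1, 0.9]    # hypothetical: the sky detector is confident
    snow = [0.5, 0.5]   # hypothetical: the snow detector is undecided
    compat = [[2.0, 1.0], [1.0, 2.0]]  # hypothetical: concepts tend to co-occur
    post_sky, post_snow = fuse_two_concepts(sky, snow, compat)
    print("P(sky present)  =", round(post_sky[1], 3))
    print("P(snow present) =", round(post_snow[1], 3))
```

With a factor that favors co-occurrence, confident evidence for one concept raises the posterior of the related, uncertain concept. This is the kind of contextual effect the multinet is described as exploiting, although the paper's actual factor structure and learned parameters are not given in this record.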
Pages: 40-52
Page count: 13
Related papers
  • [1] A factor graph framework for semantic indexing and retrieval in video
    Naphade, MR
    Kozintsev, I
    Huang, TS
    Ramchandran, K
    [J]. IEEE WORKSHOP ON CONTENT-BASED ACCESS OF IMAGE AND VIDEO LIBRARIES, PROCEEDINGS, 2000, : 35 - 39
  • [2] Semantic Video Indexing using a probabilistic framework
    Naphade, MR
    Huang, TS
    [J]. 15TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 3, PROCEEDINGS: IMAGE, SPEECH AND SIGNAL PROCESSING, 2000, : 79 - 84
  • [3] A probabilistic framework for semantic indexing and retrieval in video
    Naphade, MR
    Huang, TS
    [J]. 2000 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, PROCEEDINGS VOLS I-III, 2000, : 475 - 478
  • [4] A probabilistic framework for semantic video indexing, filtering, and retrieval
    Naphade, MR
    Huang, TS
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2001, 3 (01) : 141 - 151
  • [5] A Framework for Semantic Video Content Indexing Using Textual Information
    Mansouri, Sadek
    Charhad, Mbarek
    Rekik, Ali
    Zrigui, Mounir
    [J]. 2018 IEEE SECOND INTERNATIONAL CONFERENCE ON DATA STREAM MINING & PROCESSING (DSMP), 2018, : 107 - 110
  • [6] Probabilistic semantic video indexing
    Naphade, MR
    Kozintsev, I
    Huang, T
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 13, 2001, 13 : 967 - 973
  • [7] A generic framework for semantic video indexing based on visual concepts/contexts detection
    Nizar Elleuch
    Anis Ben Ammar
    Adel M. Alimi
    [J]. Multimedia Tools and Applications, 2015, 74 : 1397 - 1421
  • [8] A generic framework for semantic video indexing based on visual concepts/contexts detection
    Elleuch, Nizar
    Ben Ammar, Anis
    Alimi, Adel M.
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2015, 74 (04) : 1397 - 1421
  • [9] MOTION DESCRIPTORS FOR SEMANTIC VIDEO INDEXING
    Zampoglou, Markos
    Papadimitriou, Theophilos
    Diamantaras, Konstantinos I.
    [J]. SIGMAP 2010: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND MULTIMEDIA APPLICATION, 2010, : 178 - 184
  • [10] Semantic structures for video data indexing
    Zettsu, K
    Uehara, K
    Tanaka, K
    [J]. ADVANCED MULTIMEDIA CONTENT PROCESSING, 1999, 1554 : 356 - 369