A multi-modal system for the retrieval of semantic video events

Cited by: 13
Authors
Amir, A
Basu, S
Iyengar, G
Lin, CY
Naphade, M
Smith, JR
Srinivasan, S
Tseng, B
Affiliations
[1] IBM Corp, Almaden Res Ctr, San Jose, CA 95120 USA
[2] IBM TJ Watson Res Ctr, Hawthorne, NY 10532 USA
[3] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA
Keywords
multimedia indexing; event detection; semantic video annotation; content-based video retrieval
DOI
10.1016/j.cviu.2004.02.006
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
A framework for event detection is proposed where events, objects, and other semantic concepts are detected from video using trained classifiers. These classifiers are used to automatically annotate video with semantic labels, which in turn are used to search for new, untrained types of events and semantic concepts. The novelty of the approach lies in the: (1) semi-automatic construction of models of events from feature descriptors and (2) integration of content-based and concept-based querying in the search process. Speech retrieval is independently applied and combined results are produced. Results of applying these to the Search benchmark of the NIST TREC Video track 2001 are reported, and the gained experience and future work are discussed. © 2004 Published by Elsevier Inc.
Pages: 216-236
Page count: 21