A multi-modal system for the retrieval of semantic video events

Cited by: 12
Authors
Amir, A
Basu, S
Iyengar, G
Lin, CY
Naphade, M
Smith, JR
Srinivasan, S
Tseng, B
Affiliations
[1] IBM Corp, Almaden Res Ctr, San Jose, CA 95120 USA
[2] IBM TJ Watson Res Ctr, Hawthorne, NY 10532 USA
[3] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA
Keywords
multimedia indexing; event detection; semantic video annotation; content-based video retrieval
DOI
10.1016/j.cviu.2004.02.006
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A framework for event detection is proposed where events, objects, and other semantic concepts are detected from video using trained classifiers. These classifiers are used to automatically annotate video with semantic labels, which in turn are used to search for new, untrained types of events and semantic concepts. The novelty of the approach lies in (1) the semi-automatic construction of models of events from feature descriptors and (2) the integration of content-based and concept-based querying in the search process. Speech retrieval is applied independently, and combined results are produced. Results of applying these techniques to the Search benchmark of the NIST TREC Video track 2001 are reported, and the experience gained and future work are discussed. © 2004 Published by Elsevier Inc.
Pages: 216-236
Page count: 21