A multi-modal system for the retrieval of semantic video events

Cited by: 13

Authors:

Amir, A

Basu, S

Iyengar, G

Lin, CY

Naphade, M

Smith, JR

Srinivasan, S

Tseng, B

Affiliations:

[1] IBM Corp, Almaden Res Ctr, San Jose, CA 95120 USA

[2] IBM TJ Watson Res Ctr, Hawthorne, NY 10532 USA

[3] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA

Source:

COMPUTER VISION AND IMAGE UNDERSTANDING | 2004 / Vol. 96 / No. 02

Keywords:

multimedia indexing; event detection; semantic video annotation; content-based video retrieval;

DOI:

10.1016/j.cviu.2004.02.006

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

A framework for event detection is proposed where events, objects, and other semantic concepts are detected from video using trained classifiers. These classifiers are used to automatically annotate video with semantic labels, which in turn are used to search for new, untrained types of events and semantic concepts. The novelty of the approach lies in (1) the semi-automatic construction of models of events from feature descriptors and (2) the integration of content-based and concept-based querying in the search process. Speech retrieval is independently applied and combined results are produced. Results of applying these to the Search benchmark of the NIST TREC Video track 2001 are reported, and the gained experience and future work are discussed. © 2004 Published by Elsevier Inc.

Pages: 216-236

Page count: 21

