Semantic indexing of multimedia content using visual, audio, and text cues

Cited by: 0

Authors:

[1] Adams, W.H.

[2] Iyengar, Giridharan

[3] Lin, Ching-Yung

[4] Naphade, Milind Ramesh

[5] Neti, Chalapathy

[6] Nock, Harriet J.

[7] Smith, John R.

Source:

Adams, W.H. (whadams@us.ibm.com) | Hindawi Publishing Corporation, 2003

Keywords:

Information analysis - Learning systems - Markov processes - Semantics - Statistical methods

DOI:

Not available

Abstract:

We present a learning-based approach to the semantic indexing of multimedia content using cues derived from audio, visual, and text features. We approach the problem by developing a set of statistical models for a predefined lexicon. Novel concepts are then mapped in terms of the concepts in the lexicon. To achieve robust detection of concepts, we exploit features from multiple modalities, namely, audio, video, and text. Concept representations are modeled using Gaussian mixture models (GMM), hidden Markov models (HMM), and support vector machines (SVM). Models such as Bayesian networks and SVMs are used in a late-fusion approach to model concepts that are not explicitly modeled in terms of features. Our experiments indicate promise in the proposed classification and fusion methodologies: our proposed fusion scheme achieves more than 10% relative improvement over the best unimodal concept detector.
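The late-fusion idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it swaps in single 1-D Gaussians per class in place of GMMs/HMMs/SVMs for the unimodal detectors, and a plain logistic regression in place of the Bayesian-network or SVM fusion models. The names `UnimodalDetector`, `train_fusion`, `relative_improvement`, and `demo` are hypothetical, the data are synthetic, and accuracy is measured on the training samples purely for illustration.

```python
import math
import random


def fit_gaussian(values):
    """Return (mean, variance) of a 1-D sample, with a small variance floor."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, max(var, 1e-6)


def log_gaussian(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


class UnimodalDetector:
    """Hypothetical per-modality detector: log-likelihood ratio of two Gaussians."""

    def fit(self, feats, labels):
        self.pos = fit_gaussian([f for f, y in zip(feats, labels) if y == 1])
        self.neg = fit_gaussian([f for f, y in zip(feats, labels) if y == 0])
        return self

    def score(self, x):
        return log_gaussian(x, *self.pos) - log_gaussian(x, *self.neg)


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_fusion(score_vectors, labels, epochs=200, lr=0.1):
    """Fit logistic-regression weights over detector scores by batch gradient descent."""
    dim = len(score_vectors[0])
    w, b = [0.0] * dim, 0.0
    n = len(labels)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for s, y in zip(score_vectors, labels):
            err = sigmoid(b + sum(wi * si for wi, si in zip(w, s))) - y
            grad_b += err
            for i in range(dim):
                grad_w[i] += err * s[i]
        b -= lr * grad_b / n
        w = [wi - lr * gi / n for wi, gi in zip(w, grad_w)]
    return w, b


def accuracy(decisions, labels):
    """Fraction of decisions that match the labels."""
    return sum(int(d == y) for d, y in zip(decisions, labels)) / len(labels)


def relative_improvement(baseline, improved):
    """Relative gain of `improved` over `baseline`, e.g. 0.10 for a 10% gain."""
    return (improved - baseline) / baseline


def demo(seed=0, n=400):
    """Train three synthetic unimodal detectors, fuse their scores, compare accuracy."""
    rng = random.Random(seed)
    labels = [i % 2 for i in range(n)]
    # Synthetic 1-D "audio", "visual", and "text" features (made-up data).
    modalities = [[rng.gauss(float(y), 1.0) for y in labels] for _ in range(3)]
    detectors = [UnimodalDetector().fit(f, labels) for f in modalities]
    scores = [[d.score(f[i]) for d, f in zip(detectors, modalities)] for i in range(n)]
    best_unimodal = max(
        accuracy([int(s[m] > 0) for s in scores], labels) for m in range(3)
    )
    w, b = train_fusion(scores, labels)
    fused = accuracy(
        [int(b + sum(wi * si for wi, si in zip(w, s)) > 0) for s in scores], labels
    )
    return best_unimodal, fused


if __name__ == "__main__":
    best, fused = demo()
    print(f"best unimodal accuracy: {best:.3f}")
    print(f"fused accuracy: {fused:.3f}")
    print(f"relative improvement: {relative_improvement(best, fused):.1%}")
```

The fusion stage sees only the detector scores, never the raw features, which is what makes it "late" fusion. `relative_improvement` shows how a figure such as the abstract's "more than 10% relative improvement" is computed from a baseline and a fused score; the numbers this toy prints are not comparable to the paper's results.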

50 items in total

[1] Semantic Indexing of Multimedia Content Using Visual, Audio, and Text Cues
W. H. Adams
Giridharan Iyengar
Ching-Yung Lin
Milind Ramesh Naphade
Chalapathy Neti
Harriet J. Nock
John R. Smith
EURASIP Journal on Advances in Signal Processing, 2003
[2] Semantic indexing of multimedia content using visual, audio, and text cues
Adams, WH
Iyengar, G
Lin, CY
Naphade, MR
Neti, C
Nock, HJ
Smith, JR
EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING, 2003, 2003 (02) : 170 - 185
[3] Semantic indexing of multimedia using audio, text and visual cues
Iyengar, G
Nock, H
Neti, C
Franz, M
IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOL I AND II, PROCEEDINGS, 2002 : A369 - A372
[4] Semantic indexing of multimedia content using textual and visual information
Amrane, A. (amrane@mail.cerist.dz), 1600, Inderscience Enterprises Ltd. (05) : 2 - 3
[5] Audio visual cues for video indexing and retrieval
Muneesawang, P
Amin, T
Guan, L
ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2004, PT 1, PROCEEDINGS, 2004, 3331 : 642 - 649
[6] Audio visual cues for video indexing and retrieval
Muneesawang, Paisarn
Amin, Tahir
Guan, Ling
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2004, 3331 : 642 - 649
[7] Automatic indexing of multimedia content by integration of audio, spoken language, and visual information
Ohtsuki, K
Bessho, K
Matsuo, Y
Matsunaga, S
Hayashi, Y
ASRU'03: 2003 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING ASRU '03, 2003 : 601 - 606
[8] Audio Clips Content Comparison Using Latent Semantic Indexing
Biatov, Konstantin
Koehler, Joachim
Schneider, Daniel
2009 IEEE THIRD INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC 2009), 2009 : 509 - 512
[9] Semantic representation of multimedia content: Knowledge representation and semantic indexing
Phivos Mylonas
Thanos Athanasiadis
Manolis Wallace
Yannis Avrithis
Stefanos Kollias
Multimedia Tools and Applications, 2008, 39 : 293 - 327
[10] Semantic representation of multimedia content: Knowledge representation and semantic indexing
Mylonas, Phivos
Athanasiadis, Thanos
Wallace, Manolis
Avrithis, Yannis
Kollias, Stefanos
MULTIMEDIA TOOLS AND APPLICATIONS, 2008, 39 (03) : 293 - 327
