Semantic indexing of multimedia content using visual, audio, and text cues

Cited by: 0
Authors
[1] Adams, W.H.
[2] Iyengar, Giridharan
[3] Lin, Ching-Yung
[4] Naphade, Milind Ramesh
[5] Neti, Chalapathy
[6] Nock, Harriet J.
[7] Smith, John R.
Source
Adams, W.H. (whadams@us.ibm.com) | Hindawi Publishing Corporation, 2003
Keywords
Information analysis - Learning systems - Markov processes - Semantics - Statistical methods
DOI
Not available
Abstract
We present a learning-based approach to the semantic indexing of multimedia content using cues derived from audio, visual, and text features. We approach the problem by developing a set of statistical models for a predefined lexicon. Novel concepts are then mapped in terms of the concepts in the lexicon. To achieve robust detection of concepts, we exploit features from multiple modalities, namely, audio, video, and text. Concept representations are modeled using Gaussian mixture models (GMM), hidden Markov models (HMM), and support vector machines (SVM). Models such as Bayesian networks and SVMs are used in a late-fusion approach to model concepts that are not explicitly modeled in terms of features. Our experiments indicate promise in the proposed classification and fusion methodologies: our proposed fusion scheme achieves more than 10% relative improvement over the best unimodal concept detector.
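The late-fusion idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a small logistic-regression combiner stands in for the SVM and Bayesian-network fusers the abstract names, the function names are hypothetical, and the toy scores and labels are made up for illustration. The helper `relative_improvement` shows how a "10% relative improvement" figure is computed from a baseline and a fused score.

```python
import math


def relative_improvement(baseline, fused):
    """Relative gain of `fused` over `baseline`, as a fraction (0.10 == 10%)."""
    return (fused - baseline) / baseline


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_late_fusion(scores, labels, epochs=500, lr=0.5):
    """Fit logistic-regression weights over per-modality detector scores.

    scores: list of [audio, visual, text] confidences, one row per shot
    labels: list of 0/1 ground-truth labels for the target concept
    Assumption: logistic regression replaces the SVM / Bayesian-network
    fusers mentioned in the abstract, to keep the sketch dependency-free.
    """
    n_features = len(scores[0])
    n_samples = len(scores)
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(scores, labels):
            pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = pred - y
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        # Batch gradient-descent step on the mean log-loss gradient.
        weights = [w - lr * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n_samples
    return weights, bias


def fuse(weights, bias, x):
    """Combine one row of unimodal scores into a single fused confidence."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Made-up unimodal detector outputs (audio, visual, text) for four shots.
TOY_SCORES = [[0.9, 0.8, 0.7], [0.8, 0.6, 0.9], [0.2, 0.3, 0.1], [0.1, 0.4, 0.2]]
TOY_LABELS = [1, 1, 0, 0]

if __name__ == "__main__":
    w, b = train_late_fusion(TOY_SCORES, TOY_LABELS)
    for row in TOY_SCORES:
        print(row, round(fuse(w, b, row), 3))
```

Late fusion of this kind trains the combiner on detector outputs rather than on raw features, so each unimodal model (GMM, HMM, or SVM in the abstract) can be built and tuned independently before the fusion step.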
Related papers
  • [1] Semantic Indexing of Multimedia Content Using Visual, Audio, and Text Cues
    W. H. Adams
    Giridharan Iyengar
    Ching-Yung Lin
    Milind Ramesh Naphade
    Chalapathy Neti
    Harriet J. Nock
    John R. Smith
    EURASIP Journal on Advances in Signal Processing, 2003
  • [2] Semantic indexing of multimedia content using visual, audio, and text cues
    Adams, WH
    Iyengar, G
    Lin, CY
    Naphade, MR
    Neti, C
    Nock, HJ
    Smith, JR
    EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING, 2003, 2003 (02) : 170 - 185
  • [3] Semantic indexing of multimedia using audio, text and visual cues
    Iyengar, G
    Nock, H
    Neti, C
    Franz, M
    IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOL I AND II, PROCEEDINGS, 2002, : A369 - A372
  • [4] Semantic indexing of multimedia content using textual and visual information
    Amrane, A. (amrane@mail.cerist.dz), Inderscience Enterprises Ltd. (05): 2 - 3
  • [5] Audio visual cues for video indexing and retrieval
    Muneesawang, P
    Amin, T
    Guan, L
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2004, PT 1, PROCEEDINGS, 2004, 3331 : 642 - 649
  • [6] Audio visual cues for video indexing and retrieval
    Muneesawang, Paisarn
    Amin, Tahir
    Guan, Ling
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2004, 3331 : 642 - 649
  • [7] Automatic indexing of multimedia content by integration of audio, spoken language, and visual information
    Ohtsuki, K
    Bessho, K
    Matsuo, Y
    Matsunaga, S
    Hayashi, Y
    ASRU'03: 2003 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING ASRU '03, 2003, : 601 - 606
  • [8] Audio Clips Content Comparison Using Latent Semantic Indexing
    Biatov, Konstantin
    Koehler, Joachim
    Schneider, Daniel
    2009 IEEE THIRD INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC 2009), 2009, : 509 - 512
  • [9] Semantic representation of multimedia content: Knowledge representation and semantic indexing
    Phivos Mylonas
    Thanos Athanasiadis
    Manolis Wallace
    Yannis Avrithis
    Stefanos Kollias
    Multimedia Tools and Applications, 2008, 39 : 293 - 327
  • [10] Semantic representation of multimedia content: Knowledge representation and semantic indexing
    Mylonas, Phivos
    Athanasiadis, Thanos
    Wallace, Manolis
    Avrithis, Yannis
    Kollias, Stefanos
    MULTIMEDIA TOOLS AND APPLICATIONS, 2008, 39 (03) : 293 - 327