Recognizing high-level audio-visual concepts using context

Cited by: 0

Authors:

Naphade, MR [1]

Huang, TS [1]

Affiliations:

[1] Univ Illinois, Dept Elect & Comp Engn, Coordinated Sci Lab, Urbana, IL 61801 USA

Source:

2001 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOL III, PROCEEDINGS | 2001

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Recognition of high-level semantics from audio-visual data is a challenging multimedia understanding problem. The difficulty lies mainly in the gap between low-level media features and high-level semantic concepts. In an attempt to bridge this gap, we proposed a probabilistic framework for semantic understanding [6, 5]. The components of this framework are probabilistic multimedia objects and a graphical network of such objects. In this paper we show how the framework supports detection of multiple high-level concepts that have spatial and temporal support. More importantly, we show why context matters and how it can be modeled. Using a factor graph framework, we model context and use it to improve detection of sites, objects, and events. Using the concepts Outdoor and flying-helicopter, we demonstrate how the factor graph multinet models context. Using ROC curves and probability-of-error curves, we support the intuition that context should help.

Pages: 46-49

Page count: 4

