Recognizing high-level audio-visual concepts using context

Cited by: 0

Authors:

Naphade, MR [1]

Huang, TS [1]

Affiliations:

[1] Univ Illinois, Dept Elect & Comp Engn, Coordinated Sci Lab, Urbana, IL 61801 USA

Source:

2001 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOL III, PROCEEDINGS | 2001

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP31 [Computer software]

Discipline classification codes:

081202; 0835

Abstract:

Recognition of high-level semantics from audio-visual data is a challenging multimedia understanding problem. The difficulty lies mainly in the gap between low-level media features and high-level semantic concepts. In an attempt to bridge this gap, we proposed a probabilistic framework for semantic understanding [6, 5]. The components of this framework are probabilistic multimedia objects and a graphical network of such objects. In this paper we show how the framework supports detection of multiple high-level concepts that have spatial and temporal support. More importantly, we show why context matters and how it can be modeled. Using a factor graph framework, we model context and use it to improve detection of sites, objects and events. Using the concepts Outdoor and flying-helicopter, we demonstrate how the factor graph multinet models context. Using ROC curves and probability-of-error curves, we support the intuition that context should help.
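The idea that context between concepts can sharpen detection can be illustrated with a toy example. The sketch below is not the authors' factor graph multinet; it is a minimal two-variable factor graph, assuming two independent binary detectors (for Outdoor and flying-helicopter) and a hand-picked compatibility table. The function name `fuse_with_context`, the table `COMPAT`, and all numeric values are hypothetical and chosen only for illustration.

```python
# Toy context fusion on a two-variable factor graph (illustrative only).
# Variables: o = Outdoor (0/1), h = flying-helicopter (0/1).
# Assumption: each detector outputs a probability that its concept is present.

def fuse_with_context(p_outdoor, p_heli, compat):
    """Return context-adjusted marginals (P(o=1), P(h=1)).

    compat[o][h] is a non-negative compatibility factor linking the
    two concepts. With only two variables, brute-force enumeration of
    the joint gives the same marginals as sum-product message passing.
    """
    phi_o = [1.0 - p_outdoor, p_outdoor]   # local evidence for Outdoor
    phi_h = [1.0 - p_heli, p_heli]         # local evidence for flying-helicopter
    joint = [[phi_o[o] * phi_h[h] * compat[o][h] for h in (0, 1)]
             for o in (0, 1)]
    z = sum(sum(row) for row in joint)     # normalizing constant
    post_outdoor = sum(joint[1]) / z
    post_heli = (joint[0][1] + joint[1][1]) / z
    return post_outdoor, post_heli

# Hypothetical compatibility table: a flying helicopter in a scene that is
# not outdoors (o=0, h=1) is strongly down-weighted; other pairs are neutral.
COMPAT = [[1.0, 0.05],
          [1.0, 1.0]]

post_outdoor, post_heli = fuse_with_context(0.3, 0.6, COMPAT)
print(post_outdoor, post_heli)
```

With these assumed inputs, the weak Outdoor evidence lowers the belief in flying-helicopter, while the helicopter evidence raises the belief in Outdoor. Setting every entry of the compatibility table to 1.0 removes the context and returns the detector outputs unchanged.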


Pages: 46-49

Page count: 4
