Identification of story units in audio-visual sequences by joint audio and video processing

Cited by: 0

Authors:

Saraceno, C [1]

Leonardi, R [1]

Affiliations:

[1] Univ Brescia, SCL Dept Elect Automat, I-25123 Brescia, Italy

Source:

1998 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING - PROCEEDINGS, VOL 1 | 1998

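Keywords:

DOI:

Not available

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification codes:

0808; 0809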

Abstract:

This paper proposes a novel technique that uses joint audio-visual analysis for scene identification and characterization. The paper defines four scene types: dialogues, stories, actions, and generic scenes. It then explains how any audio-visual material can be decomposed into a series of scenes conforming to the previous classification, by properly analyzing and then combining the underlying audio and visual information. A rule-based procedure is defined for this purpose. Before the rule-based decision can take place, a series of low-level pre-processing tasks are suggested to adequately measure audio and visual correlations. As far as visual information is concerned, it is proposed to measure similarities between non-consecutive shots using a Learning Vector Quantization approach. A possible implementation strategy for the overall scene identification task is outlined and validated through a series of experimental simulations on real audio-visual data.
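
The abstract names Learning Vector Quantization (LVQ) as the tool for comparing non-consecutive shots but gives no details. The following is only a minimal sketch of the generic LVQ1 update rule, not the authors' procedure. The feature vectors (toy 3-bin colour histograms), the labels "A" and "B", the learning-rate schedule, and the helper names `train_lvq1` and `same_cluster` are all assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_prototype(prototypes, x):
    """Index of the prototype closest to feature vector x."""
    return min(range(len(prototypes)),
               key=lambda i: squared_distance(prototypes[i][0], x))


def train_lvq1(samples, prototypes, lr=0.1, epochs=20, seed=0):
    """Generic LVQ1: pull the winning prototype toward a sample of the
    same class, push it away from a sample of a different class."""
    rng = random.Random(seed)
    protos = [(list(vec), label) for vec, label in prototypes]
    for epoch in range(epochs):
        rate = lr * (1.0 - epoch / epochs)  # linearly decaying step size
        order = list(samples)
        rng.shuffle(order)
        for x, label in order:
            w = nearest_prototype(protos, x)
            vec, proto_label = protos[w]
            sign = 1.0 if proto_label == label else -1.0
            for d in range(len(vec)):
                vec[d] += sign * rate * (x[d] - vec[d])
    return protos


# Hypothetical toy data: 3-bin colour histograms of key frames, labelled
# by the (assumed) camera setup "A" or "B" they come from.
shots = [
    ([0.8, 0.1, 0.1], "A"), ([0.7, 0.2, 0.1], "A"), ([0.75, 0.15, 0.1], "A"),
    ([0.1, 0.2, 0.7], "B"), ([0.2, 0.1, 0.7], "B"), ([0.1, 0.1, 0.8], "B"),
]
initial = [([0.6, 0.2, 0.2], "A"), ([0.2, 0.2, 0.6], "B")]
codebook = train_lvq1(shots, initial)


def same_cluster(x, y):
    """Treat two shots as similar if they map to the same prototype."""
    return nearest_prototype(codebook, x) == nearest_prototype(codebook, y)
```

In this sketch, two shots count as similar when their feature vectors fall on the same trained prototype, for example `same_cluster(shots[0][0], shots[1][0])`. How the paper actually derives a similarity measure from LVQ is not stated in the abstract, so this mapping is an assumption.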


Pages: 363-367

Number of pages: 5
