Identification of story units in audio-visual sequences by joint audio and video processing

被引:0
|
作者
Saraceno, C [1 ]
Leonardi, R [1 ]
机构
[1] Univ Brescia, SCL Dept Elect Automat, I-25123 Brescia, Italy
关键词
D O I
暂无
中图分类号
TM [电工技术]; TN [电子技术、通信技术];
学科分类号
0808 ; 0809 ;
摘要
In this paper, a novel technique, which uses a joint audio-visual analysis for scene identification and characterization, is proposed. The paper defines four different scene types: dialogues, stories, actions, and generic scenes. It then explains how any audio-visual material can be decomposed into a series of scenes obeying to the preview classification, by properly analyzing and then combining the underlying audio and visual information. A rule-based procedure is defined for such purpose. Before such rule-based decision can take place, a series of low-level pre-processing tasks care suggested to adequately measure audio and visual correlations. As far as visual information is concerned, it is proposed to measure similarities between non consecutive shots using a Learning Vector Quantization approach. An outlook on a possible implementation strategy for the overall scene identification task is suggested, and validated through a series of experimental simulations on real audio-visual data.
引用
收藏
页码:363 / 367
页数:5
相关论文
共 50 条
  • [1] Indexing audio-visual sequences by joint audio and video processing
    Saraceno, C
    Leonardi, R
    [J]. VSMM98: FUTUREFUSION - APPLICATION REALITIES FOR THE VIRTUAL AGE, VOLS 1 AND 2, 1998, : 686 - 691
  • [2] Video clip recognition using joint audio-visual processing model
    Kulesh, V
    Petrushin, VA
    Sethi, IK
    [J]. 16TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL I, PROCEEDINGS, 2002, : 500 - 503
  • [3] Video clip recognition using joint audio-visual processing model
    Kulesh, Victor
    Petrushin, Valery A.
    Sethi, Ishwar K.
    [J]. Proceedings - International Conference on Pattern Recognition, 2002, 16 (01): : 500 - 503
  • [4] Audio-visual event recognition in surveillance video sequences
    Cristani, Marco
    Bicego, Manuele
    Murino, Vittorio
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2007, 9 (02) : 257 - 267
  • [5] A JOINT AUDIO-VISUAL APPROACH TO AUDIO LOCALIZATION
    Jensen, Jesper Rindom
    Christensen, Mads Graesboll
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 454 - 458
  • [6] Special issue on joint audio-visual speech processing
    Neti, C
    Potamianos, G
    Luettin, J
    Vatikiotis-Bateson, E
    [J]. EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING, 2002, 2002 (11) : 1151 - 1153
  • [7] VIDEO CAMERA IDENTIFICATION USING AUDIO-VISUAL FEATURES
    Milani, S.
    Cuccovillo, L.
    Tagliasacchi, M.
    Tubaro, S.
    Aichroth, P.
    [J]. 2014 5TH EUROPEAN WORKSHOP ON VISUAL INFORMATION PROCESSING (EUVIP 2014), 2014,
  • [8] Discovering joint audio-visual codewords for video event detection
    Jhuo, I-Hong
    Ye, Guangnan
    Gao, Shenghua
    Liu, Dong
    Jiang, Yu-Gang
    Lee, D. T.
    Chang, Shih-Fu
    [J]. MACHINE VISION AND APPLICATIONS, 2014, 25 (01) : 33 - 47
  • [9] VidQ: Video Query Using Optimized Audio-Visual Processing
    Felemban, Noor
    Mehmeti, Fidan
    Porta, Thomas F.
    [J]. IEEE-ACM TRANSACTIONS ON NETWORKING, 2023, 31 (03) : 1338 - 1352
  • [10] Audio-visual quality and interactions between television audio and video
    Joly, A
    Montard, N
    Buttin, M
    [J]. ISSPA 2001: SIXTH INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND ITS APPLICATIONS, VOLS 1 AND 2, PROCEEDINGS, 2001, : 438 - 441