Audio-visual events for multi-camera synchronization

Cited by: 6
Authors
Casanovas, Anna Llagostera [1]
Cavallaro, Andrea [2]
Affiliations
[1] SwissQual AG, Zuchwil, Switzerland
[2] Queen Mary Univ London, Ctr Intelligent Sensing, London, England
Funding
Swiss National Science Foundation; UK Engineering and Physical Sciences Research Council (EPSRC)
Keywords
Audio-visual processing; Multiple cameras; Synchronization; Event detection; Alignment; Speech
DOI
10.1007/s11042-014-1872-y
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We present a multimodal method for the automatic synchronization of audio-visual recordings captured with a set of independent cameras. The proposed method jointly processes data from the audio and video channels to estimate the inter-camera delays that are used to temporally align the recordings. Our approach is composed of three main steps. First, we extract temporally sharp audio-visual events from each recording. These events are short and are characterized by an audio onset that occurs jointly with a well-localized spatio-temporal change in the video data. Then, we estimate the inter-camera delays by assessing the co-occurrence of the events in the various recordings. Finally, we use a cross-validation procedure that combines the results for all camera pairs and aligns the recordings on a global timeline. An important feature of the proposed method is that it estimates a confidence level for the results, which allows us to automatically reject recordings that are not reliable for the alignment. Results show that our method outperforms state-of-the-art approaches based on audio-only or video-only analysis, with both fixed and hand-held moving cameras.
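The delay-estimation and pair-combination steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no algorithmic details, so the histogram voting over event-time differences, the simple confidence score, and the triangle-consistency check below are stand-in techniques chosen for illustration. The function names `estimate_delay` and `consistent_triples`, and the parameters `max_delay`, `bin_width`, and `tolerance`, are hypothetical. Event timestamps (in seconds) are assumed to have already been extracted from each recording.

```python
from collections import Counter
from itertools import combinations
import math


def estimate_delay(events_a, events_b, max_delay=10.0, bin_width=0.04):
    """Return (delay, confidence) such that events_b ~ events_a + delay.

    Assumption: a plain vote over all pairwise time differences; events
    that co-occur in both recordings pile up in a single histogram bin.
    """
    votes = Counter()
    for ta in events_a:
        for tb in events_b:
            d = tb - ta
            if abs(d) <= max_delay:
                votes[math.floor(d / bin_width)] += 1
    if not votes:
        return None, 0.0
    best_bin, count = votes.most_common(1)[0]
    delay = (best_bin + 0.5) * bin_width  # centre of the winning bin
    # Hypothetical confidence: fraction of events explained by the peak.
    confidence = count / min(len(events_a), len(events_b))
    return delay, confidence


def consistent_triples(delays, tolerance=0.08):
    """Return camera triples (i, j, k) whose pairwise delays agree.

    delays maps (i, j) with i < j to a delay ~ offset[j] - offset[i].
    A triple is kept when d_ij + d_jk - d_ik is within the tolerance.
    """
    cams = sorted({c for pair in delays for c in pair})
    kept = []
    for i, j, k in combinations(cams, 3):
        if all(p in delays for p in ((i, j), (j, k), (i, k))):
            residual = delays[(i, j)] + delays[(j, k)] - delays[(i, k)]
            if abs(residual) <= tolerance:
                kept.append((i, j, k))
    return kept


# Synthetic example: three cameras observing the same five events.
base = [1.0, 3.2, 7.5, 12.1, 15.8]
cameras = [base, [t + 2.51 for t in base], [t - 1.30 for t in base]]
pairwise = {}
for i, j in combinations(range(len(cameras)), 2):
    delay, conf = estimate_delay(cameras[i], cameras[j])
    if delay is not None and conf >= 0.5:  # reject unreliable pairs
        pairwise[(i, j)] = delay
print(pairwise)
print(consistent_triples(pairwise))
```

In this sketch a pair is dropped when too few events support the winning delay, which loosely mirrors the idea of rejecting unreliable recordings; the 0.5 threshold is an arbitrary choice. The estimated delays are quantized to `bin_width`, so the consistency tolerance is set to two bin widths.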
Pages: 1317-1340
Page count: 24
Published in
Multimedia Tools and Applications, 2015, 74: 1317-1340