Cost-Effective Solution to Synchronized Audio-Visual Capture using Multiple Sensors

Cited by: 10
Authors
Lichtenauer, Jeroen [1]
Valstar, Michel [1]
Shen, Jie [1]
Pantic, Maja [1]
Affiliation
[1] Univ London Imperial Coll Sci Technol & Med, Dept Comp, London SW7 2AZ, England
Keywords
Video recording; Audio recording; Multisensor systems; Synchronization;
DOI
10.1109/AVSS.2009.92
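Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808 ; 0809 ;
Abstract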
Applications such as surveillance and human motion capture require high-bandwidth recording from multiple cameras. Furthermore, the recent increase in research on sensor fusion has raised the demand on synchronization accuracy between video, audio and other sensor modalities. Previously, capturing synchronized, high-resolution video from multiple cameras required complex, inflexible and expensive solutions. Our experiments show that a single PC, built from contemporary low-cost computer hardware, could currently handle up to 470 MB/s of input data. This allows capturing from 18 cameras of 780x580 pixels at 60 fps each, or 36 cameras at 30 fps. Furthermore, we achieve accurate synchronization between audio, video and additional sensors by recording audio together with sensor trigger or timestamp signals, using a multi-channel audio input. In this way, each sensor modality can be captured with separate software and hardware, allowing maximal flexibility with minimal cost.
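The figures in the abstract can be checked with a short calculation, and the synchronization idea can be illustrated in a few lines. The following is only a minimal sketch, not the authors' implementation. It assumes 1 byte per pixel (for example 8-bit raw sensor data), which the abstract does not state. The function names `required_bandwidth` and `rising_edge_times`, and the threshold-based pulse detection, are hypothetical choices made for illustration.

```python
def required_bandwidth(cameras, width, height, fps, bytes_per_pixel=1):
    """Raw input data rate in bytes per second.

    bytes_per_pixel=1 is an assumption (8-bit raw data); the abstract
    does not give the pixel format.
    """
    return cameras * width * height * fps * bytes_per_pixel


def rising_edge_times(samples, sample_rate, threshold=0.5):
    """Times in seconds at which a recorded trigger signal rises
    through the threshold.

    Illustrates the idea of recording a sensor trigger signal on a
    spare audio channel: each rising edge marks one sensor event on
    the audio timeline. The threshold value is an assumption.
    """
    times = []
    for i in range(1, len(samples)):
        if samples[i - 1] < threshold <= samples[i]:
            times.append(i / sample_rate)
    return times


if __name__ == "__main__":
    # The two camera configurations named in the abstract.
    for cams, fps in ((18, 60), (36, 30)):
        rate = required_bandwidth(cams, 780, 580, fps)
        print(f"{cams} cameras at {fps} fps: {rate / 2**20:.0f} MiB/s")
```

Under the 1-byte-per-pixel assumption, both configurations need the same 488,592,000 bytes per second, about 466 MiB/s, which is in the same range as the 470 MB/s reported in the abstract.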
Pages: 324 - 329
Number of pages: 6
Related papers
50 in total
  • [31] Omnidirectional audio-visual talker localization based on dynamic fusion of audio-visual features using validity and reliability criteria
    Denda, Yuki
    Nishiura, Takanobu
    Yamashita, Yoichi
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2008, E91D (03) : 598 - 606
  • [32] Bird Species Classification with Audio-Visual Data using CNN and Multiple Kernel Learning
    Bold, Naranchimeg
    Zhang, Chao
    Akashi, Takuya
    2019 INTERNATIONAL CONFERENCE ON CYBERWORLDS (CW), 2019, : 85 - 88
  • [33] Multiple video clips preservation using folded back audio-visual cryptography scheme
    Mukherjee, Imon
    Ganguly, Ritam
    MULTIMEDIA TOOLS AND APPLICATIONS, 2018, 77 (05) : 5281 - 5301
  • [34] A cost-effective solution for blog search
    Chen, Lin-Chih
    DATA & KNOWLEDGE ENGINEERING, 2018, 116 : 124 - 137
  • [35] Multiple camera in car audio-visual speech recognition using phonetic and visemic information
    Biswas, Astik
    Sahu, P. K.
    Chandra, Mahesh
    COMPUTERS & ELECTRICAL ENGINEERING, 2015, 47 : 35 - 50
  • [36] An efficient, cost-effective and simple solution
    Unknown
    BRITISH DENTAL JOURNAL, 2019, 227 (12) : 1073 - 1073
  • [37] NEUROCOMPUTER - CRASY A COST-EFFECTIVE SOLUTION
    GUERIN, A
    JUTTEN, C
    HERAULT, J
    NEURAL NETWORKS FROM MODELS TO APPLICATIONS, 1989, : 756 - 765
  • [38] Optimizing Audio-Visual Speech Enhancement Using Multi-Level Distortion Measures for Audio-Visual Speech Recognition
    Chen, Hang
    Wang, Qing
    Du, Jun
    Yin, Bao-Cai
    Pan, Jia
    Lee, Chin-Hui
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 2508 - 2521
  • [39] An efficient, cost-effective and simple solution
    British Dental Journal, 2019, 227 : 1073 - 1073
  • [40] Multiple cameras for audio-visual speech recognition in an automotive environment
    Navarathna, Rajitha
    Dean, David
    Sridharan, Sridha
    Lucey, Patrick
    COMPUTER SPEECH AND LANGUAGE, 2013, 27 (04) : 911 - 927