Cost-Effective Solution to Synchronized Audio-Visual Capture using Multiple Sensors

Cited by: 10
Authors
Lichtenauer, Jeroen [1]
Valstar, Michel [1]
Shen, Jie [1]
Pantic, Maja [1]
Affiliations
[1] Univ London Imperial Coll Sci Technol & Med, Dept Comp, London SW7 2AZ, England
Keywords
Video recording; Audio recording; Multisensor systems; Synchronization
DOI
10.1109/AVSS.2009.92
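Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract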
Applications such as surveillance and human motion capture require high-bandwidth recording from multiple cameras. Furthermore, the recent increase in research on sensor fusion has raised the demand for synchronization accuracy between video, audio and other sensor modalities. Previously, capturing synchronized, high-resolution video from multiple cameras required complex, inflexible and expensive solutions. Our experiments show that a single PC, built from contemporary low-cost computer hardware, can currently handle up to 470 MB/s of input data. This allows capturing from 18 cameras of 780x580 pixels at 60 fps each, or 36 cameras at 30 fps. Furthermore, we achieve accurate synchronization between audio, video and additional sensors by recording audio together with sensor trigger or timestamp signals, using a multi-channel audio input. In this way, each sensor modality can be captured with separate software and hardware, allowing maximal flexibility with minimal cost.
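
The camera counts quoted in the abstract can be checked with simple arithmetic, and the idea of recording sensor trigger signals on a spare audio channel can be illustrated with a threshold-crossing detector. The sketch below is only an illustration under stated assumptions: it assumes 1 byte per pixel (raw 8-bit frames, which the abstract does not state), and the function names `camera_data_rate` and `trigger_times`, the 48 kHz sample rate, and the 0.5 threshold are all hypothetical choices, not the authors' implementation.

```python
def camera_data_rate(width, height, fps, n_cameras, bytes_per_pixel=1):
    """Aggregate raw video data rate in bytes per second.

    bytes_per_pixel=1 is an assumption (raw 8-bit frames).
    """
    return width * height * bytes_per_pixel * fps * n_cameras


def trigger_times(samples, sample_rate, threshold=0.5):
    """Return the times (in seconds) of rising edges in a trigger channel.

    `samples` is one audio channel carrying the camera trigger pulses;
    a rising edge is a sample at or above `threshold` that follows a
    sample below it.
    """
    times = []
    for i in range(1, len(samples)):
        if samples[i - 1] < threshold <= samples[i]:
            times.append(i / sample_rate)
    return times


# Both configurations from the abstract give the same aggregate rate.
rate_60 = camera_data_rate(780, 580, 60, 18)
rate_30 = camera_data_rate(780, 580, 30, 36)
print(rate_60, rate_30)        # bytes per second
print(rate_60 / 2**20)         # the same rate in MiB/s

# Synthetic trigger channel: a 10-sample pulse every 800 samples,
# i.e. 60 pulses per second at a hypothetical 48 kHz audio sample rate.
sample_rate = 48000
samples = [1.0 if i >= 800 and i % 800 < 10 else 0.0 for i in range(2400)]
edges = trigger_times(samples, sample_rate)
print(edges)                   # frame times on the audio clock
```

Under the 8-bit assumption, both configurations come to about 489 million bytes per second (roughly 466 MiB/s), which is in line with the 470 MB/s figure. Because the trigger pulses are sampled by the same clock as the audio, each detected edge gives a frame time on the audio timeline, which is the property the abstract relies on for synchronization.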
Pages: 324-329
Page count: 6