Audio-visual event recognition in surveillance video sequences

Cited by: 104
Authors
Cristani, Marco [1]
Bicego, Manuele
Murino, Vittorio
Affiliations
[1] Univ Verona, Dipartimento Informat, I-37134 Verona, Italy
[2] Univ Sassari, DEIR, I-07100 Sassari, Italy
Keywords
audio-visual analysis; automated surveillance; event classification and clustering; multimodal background modelling and foreground detection; multimodality; scene analysis
DOI
10.1109/TMM.2006.886263
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
In the context of the automated surveillance field, automatic scene analysis and understanding systems typically consider only visual information, whereas other modalities, such as audio, are typically disregarded. This paper presents a new method able to integrate audio and visual information for scene analysis in a typical surveillance scenario, using only one camera and one monaural microphone. Visual information is analyzed by a standard visual background/foreground (BG/FG) modelling module, enhanced with a novelty detection stage and coupled with an audio BG/FG modelling scheme. These processes permit one to detect separate audio and visual patterns representing unusual unimodal events in a scene. The integration of audio and visual data is subsequently performed by exploiting the concept of synchrony between such events. The audio-visual (AV) association is carried out on-line and without need for training sequences, and is actually based on the computation of a characteristic feature called audio-video concurrence matrix, allowing one to detect and segment AV events, as well as to discriminate between them. Experimental tests involving classification and clustering of events show all the potentialities of the proposed approach, also in comparison with the results obtained by employing the single modalities and without considering the synchrony issue.
Pages: 257 - 267
Number of pages: 11
Related papers
50 in total
  • [31] Audio-Visual Speech Recognition in Noisy Audio Environments
    Palecek, Karel
    Chaloupka, Josef
    2013 36TH INTERNATIONAL CONFERENCE ON TELECOMMUNICATIONS AND SIGNAL PROCESSING (TSP), 2013, : 484 - 487
  • [32] Audio-visual speech recognition based on joint training with audio-visual speech enhancement for robust speech recognition
    Hwang, Jung-Wook
    Park, Jeongkyun
    Park, Rae-Hong
    Park, Hyung-Min
    APPLIED ACOUSTICS, 2023, 211
  • [33] End-to-End Bloody Video Recognition by Audio-Visual Feature Fusion
    Hou, Congcong
    Wu, Xiaoyu
    Wang, Ge
    PATTERN RECOGNITION AND COMPUTER VISION (PRCV 2018), PT I, 2018, 11256 : 501 - 510
  • [34] Audio-Visual Event Localization in Unconstrained Videos
    Tian, Yapeng
    Shi, Jing
    Li, Bochen
    Duan, Zhiyao
    Xu, Chenliang
    COMPUTER VISION - ECCV 2018, PT II, 2018, 11206 : 252 - 268
  • [35] Audio-visual quality and interactions between television audio and video
    Joly, A
    Montard, N
    Buttin, M
    ISSPA 2001: SIXTH INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND ITS APPLICATIONS, VOLS 1 AND 2, PROCEEDINGS, 2001, : 438 - 441
  • [36] Combining audio and video metrics to assess audio-visual quality
    Becerra Martinez, Helard A.
    Farias, Mylene C. Q.
    MULTIMEDIA TOOLS AND APPLICATIONS, 2018, 77 (18) : 23993 - 24012
  • [37] An audio-visual approach to web video categorization
    Ionescu, Bogdan Emanuel
    Seyerlehner, Klaus
    Mironica, Ionut
    Vertan, Constantin
    Lambert, Patrick
    MULTIMEDIA TOOLS AND APPLICATIONS, 2014, 70 (02) : 1007 - 1032
  • [38] Advertising video as a kind of audio-visual production
    Zarya, Svitlana
    NATIONAL ACADEMY OF MANAGERIAL STAFF OF CULTURE AND ARTS HERALD, 2016, (02) : 94 - 98
  • [39] Audio-visual Privacy Protection for Video Conference
    Venkatesh, M. Vijay
    Zhao, Jian
    Profitt, Larry
    Cheung, Sen-ching S.
    ICME: 2009 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-3, 2009, : 1574 - 1575
  • [40] Combining audio and video metrics to assess audio-visual quality
    Helard A. Becerra Martinez
    Mylène C. Q. Farias
    Multimedia Tools and Applications, 2018, 77 : 23993 - 24012