VIDEO EVENT DETECTION AND SUMMARIZATION USING AUDIO, VISUAL AND TEXT SALIENCY

Cited by: 31
Authors
Evangelopoulos, G. [1 ]
Zlatintsi, A. [1 ]
Skoumas, G. [2 ]
Rapantzikos, K. [1 ]
Potamianos, A. [2 ]
Maragos, P. [1 ]
Avrithis, Y. [1 ]
Affiliations
[1] Natl Tech Univ Athens, Sch ECE, GR-15773 Athens, Greece
[2] Tech Univ Crete, Dept ECE, Khania EL-73100, Greece
Keywords
multimodal saliency; audio; video; text processing; video abstraction; movie summarization; ATTENTION MODEL;
DOI
10.1109/ICASSP.2009.4960393
Chinese Library Classification
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Detection of perceptually important video events is formulated here on the basis of saliency models for the audio, visual and textual information conveyed in a video stream. Audio saliency is assessed by cues that quantify multifrequency waveform modulations, extracted through nonlinear operators and energy tracking. Visual saliency is measured through a spatiotemporal attention model driven by intensity, color and motion. Text saliency is derived from part-of-speech tagging of the subtitle information available with most movie distributions. The modality curves are integrated into a single attention curve, where the presence of an event may be signified in one or multiple domains. This multimodal saliency curve is the basis of a bottom-up video summarization algorithm that refines results from unimodal or audiovisual-based skimming. The algorithm performs favorably for video summarization in terms of informativeness and enjoyability.
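The integration and skimming steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the min-max normalization, the equal-weight linear fusion, the top-fraction frame selection, and the names `normalize`, `fuse_saliency`, `select_frames` and `keep_ratio` are all assumptions made for illustration. The record does not state which fusion scheme or selection rule the paper uses.

```python
import numpy as np


def normalize(curve):
    """Min-max scale a per-frame saliency curve to [0, 1]; a flat curve maps to zeros."""
    curve = np.asarray(curve, dtype=float)
    span = curve.max() - curve.min()
    if span == 0:
        return np.zeros_like(curve)
    return (curve - curve.min()) / span


def fuse_saliency(audio, visual, text, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted linear fusion of three curves (assumed scheme, not from the paper)."""
    curves = np.stack([normalize(audio), normalize(visual), normalize(text)])
    w = np.asarray(weights, dtype=float)
    return w @ curves  # shape: (n_frames,)


def select_frames(saliency, keep_ratio=0.2):
    """Boolean mask marking the top `keep_ratio` fraction of frames by saliency."""
    saliency = np.asarray(saliency, dtype=float)
    n_keep = max(1, int(round(keep_ratio * saliency.size)))
    top = np.argsort(saliency, kind="stable")[::-1][:n_keep]
    mask = np.zeros(saliency.size, dtype=bool)
    mask[top] = True
    return mask


# Toy curves for five frames (hypothetical values, one per modality).
audio = [0.0, 1.0, 0.0, 0.0, 0.0]
visual = [0.0, 1.0, 0.0, 0.0, 0.5]
text = [0.0, 0.0, 0.0, 1.0, 0.0]

fused = fuse_saliency(audio, visual, text)
mask = select_frames(fused, keep_ratio=0.4)
```

In this toy input, frame 1 is salient in both audio and video while frame 3 is salient only in text, so both can survive into the skim. That mirrors the abstract's point that an event may be signified in one or multiple domains. A practical skim would also need to smooth the mask into contiguous segments, which this sketch omits.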
Pages: 3553 / +
Number of pages: 2