An audio-visual saliency model for movie summarization

Cited by: 8
Authors
Rapantzikos, Konstantinos [1]
Evangelopoulos, Georgios [1]
Maragos, Petros [1]
Avrithis, Yannis [1]
Affiliations
[1] Natl Tech Univ Athens, Sch ECE, GR-15773 Athens, Greece
Keywords
saliency; saliency curves; attention modeling; event detection; key-frame selection; video summarization; audiovisual
DOI
10.1109/MMSP.2007.4412882
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A saliency-based method for generating video summaries is presented, which exploits coupled audiovisual information from both media streams. Efficient and advanced speech and image processing algorithms are used to detect key frames that are acoustically and visually salient. Promising results are shown from experiments on a movie database.
Pages: 320-323
Page count: 4
Related papers
50 in total
  • [31] MOVIE SUMMARIZATION BASED ON AUDIOVISUAL SALIENCY DETECTION
    Evangelopoulos, G.
    Rapantzikos, K.
    Potamianos, A.
    Maragos, P.
    Zlatintsi, A.
    Avrithis, Y.
    [J]. 2008 15TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOLS 1-5, 2008, : 2528 - 2531
  • [32] An audio-visual distance for audio-visual speech vector quantization
    Girin, L
    Foucher, E
    Feng, G
    [J]. 1998 IEEE SECOND WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING, 1998, : 523 - 528
  • [33] Catching audio-visual mice: The extrapolation of audio-visual speed
    Hofbauer, MM
    Wuerger, SM
    Meyer, GF
    Röhrbein, F
    Schill, K
    Zetzsche, C
    [J]. PERCEPTION, 2003, 32 : 96 - 96
  • [34] Affective Audio-Visual Words and Latent Topic Driving Model for Realizing Movie Affective Scene Classification
    Irie, Go
    Satou, Takashi
    Kojima, Akira
    Yamasaki, Toshihiko
    Aizawa, Kiyoharu
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2010, 12 (06) : 523 - 535
  • [35] Voicing influences the saliency of place of articulation in audio-visual speech perception in babble
    Alm, Magnus
    Behne, Dawn
    [J]. INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 2865 - 2868
  • [36] A manually denoised audio-visual movie watching fMRI dataset for the studyforrest project
    Xingyu Liu
    Zonglei Zhen
    Anmin Yang
    Haohao Bai
    Jia Liu
    [J]. Scientific Data, 6
  • [37] A manually denoised audio-visual movie watching fMRI dataset for the studyforrest project
    Liu, Xingyu
    Zhen, Zonglei
    Yang, Anmin
    Bai, Haohao
    Liu, Jia
    [J]. SCIENTIFIC DATA, 2019, 6 (1)
  • [38] An audio-visual speech recognition with a new mandarin audio-visual database
    Liao, Wen-Yuan
    Pao, Tsang-Long
    Chen, Yu-Te
    Chang, Tsun-Wei
    [J]. INT CONF ON CYBERNETICS AND INFORMATION TECHNOLOGIES, SYSTEMS AND APPLICATIONS/INT CONF ON COMPUTING, COMMUNICATIONS AND CONTROL TECHNOLOGIES, VOL 1, 2007, : 19 - +
  • [39] A ROBUST AUDIO-VISUAL SPEECH ENHANCEMENT MODEL
    Wang, Wupeng
    Xing, Chao
    Wang, Dong
    Chen, Xiao
    Sun, Fengyu
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 7529 - 7533
  • [40] AUDIO-VISUAL EDUCATION
    Brickman, William W.
    [J]. SCHOOL AND SOCIETY, 1948, 67 (1739) : 320 - 326