Context-Dependent Audio-Visual and Temporal Features Fusion for TV Commercial Detection

Authors
Zhang, Bo [1]
Zou, Jiancheng [2]
Xu, Bo [1]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
[2] North China Univ Technol, Coll Sci, Beijing 100041, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
Not available
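Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code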
0808 ; 0809 ;
Abstract
Automatic TV commercial block detection is a key component of an intelligent commercial management system. Rather than relying exclusively on audio-visual characteristics, as most previous work does, we propose an SVM-DP scheme that jointly exploits the audio-visual and global temporal characteristics of commercials. First, for each video shot, likelihood values for the commercial and general-program classes are computed from context-dependent audio-visual features using SVM-based classifiers. These values are then treated as observations of a two-state Markov chain, which helps merge shots into blocks. Finally, a Minimum Duration Constraint (MDC) and a Maximum Segment Constraint (MSC), which capture the global temporal characteristics, are each used to search for the optimal combination path with a dynamic programming approach. Experiments on real video data from TV channels in China show the effectiveness of the proposed scheme.
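The idea of merging shots into blocks under a minimum duration constraint can be illustrated with a small dynamic-programming sketch. This is not the authors' SVM-DP implementation: the function name `segment_shots`, the parameters `min_commercial_dur` and `switch_penalty`, and the choice to apply the duration constraint only to commercial blocks are all assumptions made for illustration. The per-shot log-likelihoods are taken as given inputs (in the paper they would come from the SVM-based classifiers), and the Maximum Segment Constraint is not modeled here.

```python
from itertools import accumulate

PROGRAM, COMMERCIAL = 0, 1


def segment_shots(loglik, durations, min_commercial_dur, switch_penalty=0.0):
    """Label each shot as PROGRAM or COMMERCIAL.

    loglik[i] is a pair (log-likelihood of program, log-likelihood of
    commercial) for shot i; durations[i] is the shot length in seconds.
    Hypothetical constraint: every commercial block must last at least
    min_commercial_dur seconds. Program blocks are unconstrained.
    """
    n = len(loglik)
    if n == 0:
        return []

    # Prefix sums give the duration and score of any block in O(1).
    cum_dur = [0.0] + list(accumulate(durations))
    cum_ll = [
        [0.0] + list(accumulate(row[s] for row in loglik))
        for s in (PROGRAM, COMMERCIAL)
    ]

    neg_inf = float("-inf")
    # best[t][s]: best score for shots [0, t) when the last block has label s.
    best = [[neg_inf, neg_inf] for _ in range(n + 1)]
    # back[t][s]: start index of that last block.
    back = [[None, None] for _ in range(n + 1)]

    for t in range(1, n + 1):
        for s in (PROGRAM, COMMERCIAL):
            for j in range(t):  # candidate last block covers shots [j, t)
                too_short = cum_dur[t] - cum_dur[j] < min_commercial_dur
                if s == COMMERCIAL and too_short:
                    continue
                # Blocks alternate, so the previous block has label 1 - s.
                prev = 0.0 if j == 0 else best[j][1 - s] - switch_penalty
                if prev == neg_inf:
                    continue
                score = prev + cum_ll[s][t] - cum_ll[s][j]
                if score > best[t][s]:
                    best[t][s] = score
                    back[t][s] = j

    # Backtrack from the better final label, alternating labels per block.
    s = PROGRAM if best[n][PROGRAM] >= best[n][COMMERCIAL] else COMMERCIAL
    labels = [None] * n
    t = n
    while t > 0:
        j = back[t][s]
        for i in range(j, t):
            labels[i] = s
        t, s = j, 1 - s
    return labels


if __name__ == "__main__":
    P, C, N = (-0.1, -2.3), (-2.0, -0.1), (-0.4, -1.0)  # made-up scores
    shots = [P, P, C, N, C, P]
    print(segment_shots(shots, [10.0] * 6, min_commercial_dur=30.0))
```

In the made-up example, shot 3 on its own scores slightly higher as program, but the 30-second minimum means the commercial-looking shots 2 and 4 can only be labeled commercial as part of one block that also covers shot 3, so the highest-scoring path labels shots 2 to 4 as a single commercial block. Shot-by-shot classification alone would have split it.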
Pages: 5 - 8
Number of pages: 4
Related papers
50 in total
  • [1] Fusing audio-visual fingerprint to detect TV commercial advertisement
    Ouyang, Jian-quan
    Nie, Hua
    Zhang, Min
    Li, Zezhou
    Li, Yongzhou
    [J]. COMPUTERS & ELECTRICAL ENGINEERING, 2011, 37 (06) : 991 - 1008
  • [2] Transfer of Audio-Visual Temporal Training to Temporal and Spatial Audio-Visual Tasks
    Suerig, Ralf
    Bottari, Davide
    Roeder, Brigitte
    [J]. MULTISENSORY RESEARCH, 2018, 31 (06) : 556 - 578
  • [3] Detection of documentary scene changes by audio-visual fusion
    Velivelli, A
    Ngo, CW
    Huang, TS
    [J]. IMAGE AND VIDEO RETRIEVAL, PROCEEDINGS, 2003, 2728 : 227 - 237
  • [4] Temporal Feature Prediction in Audio-Visual Deepfake Detection
    Gao, Yuan
    Wang, Xuelong
    Zhang, Yu
    Zeng, Ping
    Ma, Yingjie
    [J]. ELECTRONICS, 2024, 13 (17)
  • [5] Empirical Study of Audio-Visual Features Fusion for Gait Recognition
    Castro, Francisco M.
    Marin-Jimenez, Manuel J.
    Guil, Nicolas
    [J]. COMPUTER ANALYSIS OF IMAGES AND PATTERNS, CAIP 2015, PT I, 2015, 9256 : 727 - 739
  • [6] Omnidirectional audio-visual talker localization based on dynamic fusion of audio-visual features using validity and reliability criteria
    Denda, Yuki
    Nishiura, Takanobu
    Yamashita, Yoichi
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2008, E91D (03) : 598 - 606
  • [7] Temporal Cue Guided Video Highlight Detection with Low-Rank Audio-Visual Fusion
    Ye, Qinghao
    Shen, Xiyue
    Gao, Yuan
    Wang, Zirui
    Bi, Qi
    Li, Ping
    Yang, Guang
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021 : 7930 - 7939
  • [8] Sound Logo to Increase TV Advertising Effectiveness Based on Audio-Visual Features
    Seto, Kazuki
    Asahi, Yumi
    [J]. HUMAN INTERFACE AND THE MANAGEMENT OF INFORMATION, HIMI 2023, PT I, 2023, 14015 : 136 - 151
  • [9] Decision-Level Fusion for Audio-Visual Laughter Detection
    Reuderink, Boris
    Poel, Mannes
    Truong, Khiet
    Poppe, Ronald
    Pantic, Maja
    [J]. MACHINE LEARNING FOR MULTIMODAL INTERACTION, PROCEEDINGS, 2008, 5237 : 137 - 148
  • [10] Audio-Visual Fusion With Temporal Convolutional Attention Network for Speech Separation
    Liu, Debang
    Zhang, Tianqi
    Christensen, Mads Graesboll
    Yi, Chen
    An, Zeliang
    [J]. IEEE/ACM Transactions on Audio Speech and Language Processing, 2024, 32 : 4647 - 4660