Fast Seriation of Multiple Homogeneous-content Videos Using Audio-visual Features

Authors
Zeng, Yi-Chong [1]
Chang, Wen-Tsung [1]
Affiliation
[1] Inst Informat Ind, Adv Res Inst, Taipei, Taiwan
Keywords
Video seriation; mel-frequency cepstral coefficients; color; principal component analysis; time synchronizing
DOI
10.3233/978-1-61499-484-8-1157
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With a large number of homogeneous-content videos shared on the internet, it is essential to seriate videos with prior information. In this paper, we propose a fast video seriation (ViSA) scheme based on audio-visual features. Mel-frequency cepstral coefficients (MFCCs) and color histograms are extracted as the audio and visual features, respectively. Principal component analysis (PCA) is used to reduce feature dimensionality and, with it, the time spent on feature matching. The proposed fast feature matching then computes the difference between features and, at the same time, estimates the time difference between two videos. Chronological ordering establishes the relationship among the videos, which are then seriated by synchronizing their timelines. Experimental results show that the proposed scheme reduces computing time in feature matching by more than 50% compared with a brute-force approach, and that the error between the ground-truth and estimated time differences is very small. These results demonstrate that the proposed scheme is efficient for video seriation.
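The matching and ordering steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the fast matching works, so the code below uses a plain exhaustive lag search over feature sequences that are assumed to be already extracted (for example MFCCs or color histograms) and already reduced with PCA. The names `overlap_cost`, `estimate_lag`, `seriate`, and the `min_overlap` parameter are hypothetical, and the sketch assumes every clip overlaps the chosen reference clip in time.

```python
from typing import Dict, List, Sequence, Tuple

# One reduced feature vector per time step (assumed input, not from the paper).
Frame = Sequence[float]


def overlap_cost(a: Sequence[Frame], b: Sequence[Frame], lag: int) -> Tuple[float, int]:
    """Mean squared difference when b[i] is aligned with a[i + lag].

    Returns (cost, number of overlapping frames).
    """
    start = max(0, -lag)
    stop = min(len(b), len(a) - lag)
    if stop <= start:
        return float("inf"), 0
    total = 0.0
    for i in range(start, stop):
        total += sum((x - y) ** 2 for x, y in zip(a[i + lag], b[i]))
    n = stop - start
    return total / n, n


def estimate_lag(a: Sequence[Frame], b: Sequence[Frame], min_overlap: int = 5) -> int:
    """Exhaustively search for the lag (in frames) at which b starts after a.

    A negative result means b starts before a.
    """
    best_lag, best_cost = None, float("inf")
    for lag in range(min_overlap - len(b), len(a) - min_overlap + 1):
        cost, n = overlap_cost(a, b, lag)
        if n >= min_overlap and cost < best_cost:
            best_lag, best_cost = lag, cost
    if best_lag is None:
        raise ValueError("clips are too short to overlap by min_overlap frames")
    return best_lag


def seriate(clips: Dict[str, Sequence[Frame]], reference: str) -> List[Tuple[str, int]]:
    """Order clips by their estimated start offset relative to the reference clip."""
    offsets = []
    for name, feats in clips.items():
        lag = 0 if name == reference else estimate_lag(clips[reference], feats)
        offsets.append((name, lag))
    return sorted(offsets, key=lambda item: item[1])
```

Calling `seriate(clips, "A")` on a dictionary that maps clip names to feature sequences returns `(name, offset)` pairs sorted by estimated start time, with the reference clip at offset 0. The cost of the exhaustive search grows with both the number of candidate lags and the feature dimension, which is consistent with the abstract's motivation for reducing dimensionality before matching.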
Pages: 1157-1166
Page count: 10