Multimodal Information Fusion for Semantic Video Analysis

Cited by: 1
Authors
Gulen, Elvan [1]
Yilmaz, Turgay [1, 2]
Yazici, Adnan [1]
Affiliations
[1] Middle East Tech Univ, Dept Comp Engn, Ankara, Turkey
[2] Univ Tokyo, Inst Ind Sci, Tokyo, Japan
Keywords
Concept Interactions; Multimedia Content Analysis; Multimedia Information; Multimodal Fusion; Semantic Concept Detection
DOI
10.4018/jmdem.2012100103
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Subject classification code
081202; 0835
Abstract
Multimedia data is inherently multimodal. Successful analysis of multimedia content should therefore utilize all available modalities. In addition, because concepts can carry valuable cues about other concepts, concept interaction is a crucial source of multimedia information and helps to improve fusion performance. The aim of this study is to show that integrating the existing modalities together with concept interactions can yield better performance in detecting semantic concepts. To this end, the authors present a multimodal fusion approach that integrates semantic information obtained from various modalities along with additional semantic cues. Experiments conducted on the TRECVID 2007 and CCV Database datasets validate the superiority of this combination over the best single modality and alternative modality combinations. The results show that the proposed fusion approach provides a 16.7% relative performance gain on the TRECVID dataset and a 47.7% relative performance improvement on the CCV database over the results of the best unimodal approaches.
Pages: 52-74
Number of pages: 23