Multimodal Information Fusion for Semantic Video Analysis

Cited by: 1
Authors
Gulen, Elvan [1]
Yilmaz, Turgay [1, 2]
Yazici, Adnan [1]
Affiliations
[1] Middle East Tech Univ, Dept Comp Engn, Ankara, Turkey
[2] Univ Tokyo, Inst Ind Sci, Tokyo, Japan
Keywords
Concept Interactions; Multimedia Content Analysis; Multimedia Information; Multimodal Fusion; Semantic Concept Detection
DOI
10.4018/jmdem.2012100103
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Subject classification codes
081202; 0835
Abstract
Multimedia data is inherently multimodal, and a successful analysis of multimedia content should use all of the available modalities. In addition, because concepts carry valuable cues about other concepts, concept interaction is an important source of multimedia information that helps to increase fusion performance. This study aims to show that integrating the existing modalities together with concept interactions yields better performance in detecting semantic concepts. To that end, the authors present a multimodal fusion approach that integrates semantic information obtained from various modalities along with additional semantic cues. Experiments on the TRECVID 2007 and CCV Database datasets validate the superiority of this combination over the best single modality and over alternative modality combinations. The proposed fusion approach provides a 16.7% relative performance gain on the TRECVID dataset and a 47.7% relative performance improvement on the CCV Database over the best unimodal approaches.
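The abstract does not state the fusion rule, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes a weighted-average late fusion of per-modality concept scores, followed by a simple re-scoring step in which each concept's score is blended with evidence from the other concepts. The function names, the `interaction` matrix, and the `alpha` blending weight are all hypothetical. The last helper shows how a relative performance gain, the kind of figure reported above, is computed from a baseline and an improved score.

```python
from typing import Dict, List


def late_fusion(modality_scores: Dict[str, List[float]],
                weights: Dict[str, float]) -> List[float]:
    """Weighted average of per-concept scores across modalities (assumed rule)."""
    total = sum(weights[name] for name in modality_scores)
    n_concepts = len(next(iter(modality_scores.values())))
    fused = [0.0] * n_concepts
    for name, scores in modality_scores.items():
        for i, score in enumerate(scores):
            fused[i] += weights[name] * score / total
    return fused


def apply_concept_interactions(scores: List[float],
                               interaction: List[List[float]],
                               alpha: float = 0.5) -> List[float]:
    """Blend each concept's score with support from the other concepts.

    interaction[i][j] is a hypothetical weight for how strongly concept j
    supports concept i; alpha controls how much that context counts.
    """
    n = len(scores)
    rescored = []
    for i in range(n):
        context = sum(interaction[i][j] * scores[j] for j in range(n))
        rescored.append((1 - alpha) * scores[i] + alpha * context)
    return rescored


def relative_gain(baseline: float, improved: float) -> float:
    """Relative improvement over the baseline, in percent."""
    return 100.0 * (improved - baseline) / baseline


if __name__ == "__main__":
    # Made-up scores for two concepts from two modalities.
    fused = late_fusion({"visual": [0.8, 0.2], "audio": [0.4, 0.6]},
                        {"visual": 1.0, "audio": 1.0})
    rescored = apply_concept_interactions(fused, [[0.0, 1.0], [1.0, 0.0]])
    print(fused, rescored, relative_gain(0.5, 0.6))
```

All input numbers here are made up for illustration and are unrelated to the TRECVID or CCV results. A relative gain is measured against the baseline score, so the same absolute improvement gives a larger relative gain when the baseline is lower.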
Pages: 52-74
Number of pages: 23
Related papers
50 in total
  • [21] Semantic analysis of basketball video using motion information
    Liu, S
    Yi, HR
    Chia, LT
    Rajan, D
    Chan, S
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2004, PT 1, PROCEEDINGS, 2004, 3331 : 65 - 72
  • [22] Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis
    Han, Wei
    Chen, Hui
    Poria, Soujanya
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 9180 - 9192
  • [23] Video Reconstruction with Multimodal Information
    Xie, Zhipeng
    Duan, Yiping
    Du, Qiyuan
    Tao, Xiaoming
    Yu, Jiazhong
    2023 IEEE 98TH VEHICULAR TECHNOLOGY CONFERENCE, VTC2023-FALL, 2023,
  • [24] Multimodal Fusion for Video Search Reranking
    Wei, Shikui
    Zhao, Yao
    Zhu, Zhenfeng
    Liu, Nan
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2010, 22 (08) : 1191 - 1199
  • [25] Dynamic multimodal fusion in video search
    Xie, Lexing
    Natsev, Apostol Paul
    Tesic, Jelena
    2007 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-5, 2007, : 1499 - 1502
  • [26] A survey on multimodal video representation for semantic retrieval
    Calic, J
    Campbell, N
    Dasiopoulou, S
    Kompatsiaris, Y
    EUROCON 2005: THE INTERNATIONAL CONFERENCE ON COMPUTER AS A TOOL, VOL 1 AND 2, PROCEEDINGS, 2005, : 135 - 138
  • [27] MULTIMODAL SEMANTIC ATTENTION NETWORK FOR VIDEO CAPTIONING
    Sun, Liang
    Li, Bing
    Yuan, Chunfeng
    Zha, Zhengjun
    Hu, Weiming
    2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2019, : 1300 - 1305
  • [28] MULTIMODAL VIDEO SALIENCY ANALYSIS WITH USER-BIASED INFORMATION
    Xia, Jiangyue
    Tian, Jingqi
    Qiao, Hui
    Li, Yichen
    Wen, Jiangtao
    Han, Yuxing
    2020 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2020,
  • [29] Joint Learning for Relationship and Interaction Analysis in Video with Multimodal Feature Fusion
    Zhang, Beibei
    Yu, Fan
    Gao, Yanxin
    Ren, Tongwei
    Wu, Gangshan
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 4848 - 4852
  • [30] A short video sentiment analysis model based on multimodal feature fusion
    Shi, Hongyu
    SYSTEMS AND SOFT COMPUTING, 2024, 6