Multimodal Fusion for Video Search Reranking

Cited by: 56
Authors
Wei, Shikui [1]
Zhao, Yao [1]
Zhu, Zhenfeng [1]
Liu, Nan [1]
Affiliation
[1] Beijing Jiaotong University, Institute of Information Science, Beijing 100044, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Clustering; image/video retrieval; multimedia databases; RETRIEVAL;
DOI
10.1109/TKDE.2009.145
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Analysis of click-through data from a very large search engine log shows that users are usually interested in the top-ranked portion of returned search results. Therefore, it is crucial for search engines to achieve high accuracy on the top-ranked documents. While many methods exist for boosting video search performance, they either pay less attention to the above factor or encounter difficulties in practical applications. In this paper, we present a flexible and effective reranking method, called CR-Reranking, to improve the retrieval effectiveness. To offer high accuracy on the top-ranked results, CR-Reranking employs a cross-reference (CR) strategy to fuse multimodal cues. Specifically, multimodal features are first utilized separately to rerank the initial returned results at the cluster level, and then all the ranked clusters from different modalities are cooperatively used to infer the shots with high relevance. Experimental results show that the search quality, especially on the top-ranked results, is improved significantly.
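The two-stage idea in the abstract (rank clusters separately per modality, then combine the ranked clusters across modalities) can be illustrated with a minimal Python sketch. This is not the authors' CR-Reranking implementation: the abstract does not specify how clusters are built, scored, or fused, so the sketch assumes precomputed cluster assignments, ranks clusters by the mean initial score of their shots, and fuses by summing cluster rank levels. The function names `rank_clusters` and `cross_reference_rerank` and the toy data are hypothetical.

```python
from collections import defaultdict


def rank_clusters(initial_scores, cluster_of):
    """Rank one modality's clusters by the mean initial score of their shots.

    Returns a dict mapping cluster id -> rank level (0 = most relevant).
    Assumption: mean initial score stands in for cluster relevance.
    """
    members = defaultdict(list)
    for shot, cluster in cluster_of.items():
        members[cluster].append(initial_scores[shot])
    ordered = sorted(members, key=lambda c: -sum(members[c]) / len(members[c]))
    return {cluster: level for level, cluster in enumerate(ordered)}


def cross_reference_rerank(initial_scores, modality_clusters):
    """Rerank shots by combining cluster rank levels across modalities.

    initial_scores: dict shot id -> relevance score from the initial search.
    modality_clusters: list of dicts, one per modality, shot id -> cluster id.
    """
    levels = [rank_clusters(initial_scores, c) for c in modality_clusters]

    def sort_key(shot):
        # Sum of rank levels is smallest when every modality places the
        # shot in a highly ranked cluster (assumed fusion rule).
        agreement = sum(lv[c[shot]] for lv, c in zip(levels, modality_clusters))
        # Ties are broken by the initial score, highest first.
        return (agreement, -initial_scores[shot])

    return sorted(initial_scores, key=sort_key)


# Toy data (hypothetical): five shots, two modalities with given clusters.
scores = {"s1": 0.9, "s2": 0.8, "s3": 0.7, "s4": 0.6, "s5": 0.2}
visual = {"s1": "a", "s2": "b", "s3": "a", "s4": "a", "s5": "b"}
audio = {"s1": "x", "s2": "y", "s3": "x", "s4": "y", "s5": "y"}

reranked = cross_reference_rerank(scores, [visual, audio])
print(reranked)
```

On this toy input the shot `s2`, ranked second initially, drops to fourth because neither modality places it in a top-ranked cluster, while `s1` and `s3` stay on top because both modalities agree on them. The result is `['s1', 's3', 's4', 's2', 's5']`.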
Pages: 1191-1199
Page count: 9
Related papers
50 in total
  • [1] Dynamic multimodal fusion in video search
    Xie, Lexing
    Natsev, Apostol Paul
    Tesic, Jelena
    [J]. 2007 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-5, 2007, : 1499 - 1502
  • [2] VIDEO SEARCH RERANKING VIA ONLINE ORDINAL RERANKING
    Yang, Yi-Hsuan
    Hsu, Winston H.
    [J]. 2008 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-4, 2008, : 285 - 288
  • [3] Adaptive Learning for Multimodal Fusion in Video Search
    Lee, Wen-Yu
    Wu, Po-Tun
    Hsu, Winston
    [J]. ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2009, 2009, 5879 : 659 - 670
  • [4] Optimizing Multimodal Reranking for Web Image Search
    Li, Hao
    Wang, Meng
    Li, Zhisheng
    Zha, Zheng-Jun
    Shen, Jialie
    [J]. PROCEEDINGS OF THE 34TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR'11), 2011, : 1119 - 1120
  • [5] Online Reranking via Ordinal Informative Concepts for Context Fusion in Concept Detection and Video Search
    Yang, Yi-Hsuan
    Hsu, Winston H.
    Chen, Homer H.
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2009, 19 (12) : 1880 - 1890
  • [6] Fusion of Multimodal Embeddings for Ad-Hoc Video Search
    Francis, Danny
    Nguyen, Phuong Anh
    Huet, Benoit
    Ngo, Chong-Wah
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 1868 - 1872
  • [7] Sparse Transfer Learning for Interactive Video Search Reranking
    Tian, Xinmei
    Tao, Dacheng
    Rui, Yong
    [J]. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2012, 8 (03) : 1 - 19
  • [8] Multimodal Graph-Based Reranking for Web Image Search
    Wang, Meng
    Li, Hao
    Tao, Dacheng
    Lu, Ke
    Wu, Xindong
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2012, 21 (11) : 4649 - 4661
  • [9] Multimodal-Based Supervised Learning for Image Search Reranking
    Zhao, Shengnan
    Ma, Jun
    Cui, Chaoran
    [J]. WEB-AGE INFORMATION MANAGEMENT (WAIM 2015), 2015, 9098 : 135 - 147
  • [10] TRAFMEL: Multimodal Entity Linking Based on Transformer Reranking and Multimodal Co-Attention Fusion
    Zhang, Xiaoming
    Meng, Kaikai
    Wang, Huiyong
    [J]. INTERNATIONAL JOURNAL OF SOFTWARE ENGINEERING AND KNOWLEDGE ENGINEERING, 2024, 34 (06) : 973 - 997