Cross-media retrieval via fusing multi-modality and multi-grained data

Citations: 0
Authors
Liu, Z. [1, 2]
Yuan, S. [1, 2]
Pei, X. [1, 2]
Gao, S. [1, 2]
Han, H. [1, 2]
Affiliations
[1] Shandong Univ Finance & Econ, Sch Comp Sci & Technol, Jinan 250014, Shandong, Peoples R China
[2] Shandong Univ Finance & Econ, Shandong Prov Key Lab Digital Media Technol, Jinan 250014, Shandong, Peoples R China
Keywords
Cross-media retrieval; Multi-modality data; Multi-grained data; Multi-margin triplet loss; Margin-set;
DOI
10.24200/sci.2023.59834.6456
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract
Traditional cross-media retrieval methods focus mainly on coarse-grained data that reflect global characteristics and ignore the fine-grained descriptions of local details. They also cannot accurately describe the correlations between the anchor and irrelevant data. This paper addresses both problems by fusing coarse-grained and fine-grained features and by introducing a multi-margin triplet loss, organized as a dual framework: (1) Framework I, a multi-grained data fusion framework based on a Deep Belief Network, and (2) Framework II, a multi-modality data fusion framework based on the multi-margin triplet loss function. In Framework I, the coarse-grained and fine-grained features are fused by a joint Restricted Boltzmann Machine and then input into Framework II. In Framework II, we propose the multi-margin triplet loss, under which data belonging to different modalities and semantic categories are stepped away from the anchor using multiple margins. Experimental results show that the proposed method achieves better cross-media retrieval performance than other methods on different datasets. Furthermore, ablation experiments verify that the proposed multi-grained fusion strategy and the multi-margin triplet loss function are both effective. © 2023 Sharif University of Technology. All rights reserved.
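The abstract does not give the formula for the multi-margin triplet loss, so the following is only a minimal sketch of the general idea, not the authors' definition. It assumes a hinge-style triplet loss over Euclidean distances in which each irrelevant sample is tagged with a relation to the anchor and the margin is looked up from a margin-set by that relation. The function names, the relation labels, and the margin values are all hypothetical.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def multi_margin_triplet_loss(anchor, positive, negatives, margin_set):
    """Hinge-style triplet loss with one margin per kind of negative.

    negatives  -- list of (embedding, relation) pairs; `relation` is a
                  hypothetical label describing how the sample differs
                  from the anchor.
    margin_set -- dict mapping each relation label to its margin
                  (assumed values, not taken from the paper).
    """
    d_pos = euclidean(anchor, positive)
    total = 0.0
    for embedding, relation in negatives:
        margin = margin_set[relation]
        d_neg = euclidean(anchor, embedding)
        # A negative contributes only while it is not yet at least
        # `margin` farther from the anchor than the positive is.
        total += max(0.0, d_pos - d_neg + margin)
    return total

# Hypothetical margin-set: negatives from another modality are asked to
# sit farther from the anchor than negatives from the same modality.
MARGIN_SET = {
    "irrelevant_same_modality": 0.2,
    "irrelevant_other_modality": 0.5,
}

anchor = [0.0, 0.0]
positive = [0.0, 1.0]
negatives = [
    ([0.0, 3.0], "irrelevant_same_modality"),   # already far enough
    ([1.2, 0.0], "irrelevant_other_modality"),  # violates its margin
]
loss = multi_margin_triplet_loss(anchor, positive, negatives, MARGIN_SET)
print(loss)
```

With a single margin, both negatives would be held to the same separation from the anchor; looking the margin up per relation is one simple way to express a stepped separation of the kind the abstract describes.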
Pages: 1645-1669
Page count: 25
Related papers
50 in total
  • [31] A New Benchmark and Approach for Fine-grained Cross-media Retrieval
    He, Xiangteng
    Peng, Yuxin
    Xie, Liu
    PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019, : 1740 - 1748
  • [32] SentiStory: multi-grained sentiment analysis and event summarization with crowdsourced social media data
    Ouyang, Yi
    Guo, Bin
    Zhang, Jiafan
    Yu, Zhiwen
    Zhou, Xingshe
    PERSONAL AND UBIQUITOUS COMPUTING, 2017, 21 (01) : 97 - 111
  • [33] Search for multi-modality data in digital libraries
    Yang, J
    Zhuang, YT
    Li, Q
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2001, PROCEEDINGS, 2001, 2195 : 482 - 489
  • [34] LEARNING OPTIMAL DATA REPRESENTATION FOR CROSS-MEDIA RETRIEVAL
    Zhang, Hong
    Chen, Li
    2012 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP 2012), 2012, : 1925 - 1928
  • [35] Video-Text Retrieval by Supervised Sparse Multi-Grained Learning
    Wang, Yimu
    Shi, Peng
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023, : 633 - 649
  • [36] Multi-modality Medical Case Retrieval Using Heterogeneous Information
    Wu, Menglin
    Sun, Quansen
    INTELLIGENT COMPUTING IN BIOINFORMATICS, 2014, 8590 : 80 - 91
  • [37] Multi-grained encoding and joint embedding space fusion for video and text cross-modal retrieval
    Xiaotao Cui
    Jing Xiao
    Yang Cao
    Jia Zhu
    Multimedia Tools and Applications, 2022, 81 : 34367 - 34386
  • [38] An Approach for Mining Heterogeneous Data for Cross-Media Retrieval
    Pavan, K. Madhu
    Ananthanarayana, V. S.
    2013 FOURTH INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATIONS AND NETWORKING TECHNOLOGIES (ICCCNT), 2013,
  • [39] Image retrieval++ - web image retrieval with an enhanced multi-modality ontology
    Wang, Huan
    Chia, Liang-Tien
    Liu, Song
    MULTIMEDIA TOOLS AND APPLICATIONS, 2008, 39 (02) : 189 - 215
  • [40] Multi-grained encoding and joint embedding space fusion for video and text cross-modal retrieval
    Cui, Xiaotao
    Xiao, Jing
    Cao, Yang
    Zhu, Jia
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (24) : 34367 - 34386