Cross-media retrieval via fusing multi-modality and multi-grained data

Cited by: 0
Authors
Liu, Z. [1, 2]
Yuan, S. [1, 2]
Pei, X. [1, 2]
Gao, S. [1, 2]
Han, H. [1, 2]
Affiliations
[1] Shandong Univ Finance & Econ, Sch Comp Sci & Technol, Jinan 250014, Shandong, Peoples R China
[2] Shandong Univ Finance & Econ, Shandong Prov Key Lab Digital Media Technol, Jinan 250014, Shandong, Peoples R China
Keywords
Cross-media retrieval; Multi-modality data; Multi-grained data; Multi-margin triplet loss; Margin-set
DOI
10.24200/sci.2023.59834.6456
Chinese Library Classification (CLC) number
T [Industrial Technology]
Discipline classification code
08
Abstract
Traditional cross-media retrieval methods mainly focus on coarse-grained data that reflect global characteristics while ignoring the fine-grained descriptions of local details. Moreover, they cannot accurately describe the correlations between the anchor and irrelevant data. This paper addresses both problems by fusing coarse-grained and fine-grained features and by introducing a multi-margin triplet loss, organized as a dual framework: (1) Framework I, a multi-grained data fusion framework based on a Deep Belief Network, and (2) Framework II, a multi-modality data fusion framework based on the multi-margin triplet loss function. In Framework I, the coarse-grained and fine-grained features are fused by a joint Restricted Boltzmann Machine and then input into Framework II. In Framework II, we propose the multi-margin triplet loss, under which data belonging to different modalities and semantic categories are pushed away from the anchor stepwise, using multiple margins. Experimental results show that the proposed method achieves better cross-media retrieval performance than other methods on different datasets. Furthermore, ablation experiments verify that the proposed multi-grained fusion strategy and the multi-margin triplet loss function are effective. (c) 2023 Sharif University of Technology. All rights reserved.
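The abstract does not give the formula for the multi-margin triplet loss, so the following is only an illustrative sketch of the general idea, not the authors' formulation. It assumes a hinge-style triplet loss with Euclidean distance, in which each negative sample is assigned to a group and each group has its own margin drawn from a margin set. The group names (`same_modality`, `cross_modality`), the margin values, and the function names are all hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def multi_margin_triplet_loss(anchor, positive, negatives, margins):
    """Hinge-style triplet loss with one margin per negative group.

    anchor, positive: embedding vectors (lists of floats).
    negatives: list of (embedding, group) pairs; the group label is a
        hypothetical tag such as "same_modality" or "cross_modality".
    margins: dict mapping each group label to its margin (the assumed
        "margin set").
    """
    d_pos = euclidean(anchor, positive)
    loss = 0.0
    for embedding, group in negatives:
        d_neg = euclidean(anchor, embedding)
        # A negative contributes only if it is not yet farther from the
        # anchor than the positive by at least its group's margin.
        loss += max(0.0, d_pos - d_neg + margins[group])
    return loss


# Toy example with made-up 2-D embeddings and margins.
anchor = [0.0, 0.0]
positive = [1.0, 0.0]
negatives = [
    ([3.0, 0.0], "same_modality"),   # distance 3, margin 1: satisfied
    ([0.0, 2.0], "cross_modality"),  # distance 2, margin 2: violated by 1
]
margins = {"same_modality": 1.0, "cross_modality": 2.0}
loss = multi_margin_triplet_loss(anchor, positive, negatives, margins)
```

In this toy case only the second negative violates its margin, so it alone contributes to the loss. A larger margin for a group forces that group farther from the anchor, which is one plausible reading of data being "stepped away" by multiple margins.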
Pages: 1645-1669
Number of pages: 25