Cross-media retrieval via fusing multi-modality and multi-grained data

Cited by: 0
Authors
Liu, Z. [1, 2]
Yuan, S. [1, 2]
Pei, X. [1, 2]
Gao, S. [1, 2]
Han, H. [1, 2]
Affiliations
[1] Shandong Univ Finance & Econ, Sch Comp Sci & Technol, Jinan 250014, Shandong, Peoples R China
[2] Shandong Univ Finance & Econ, Shandong Prov Key Lab Digital Media Technol, Jinan 250014, Shandong, Peoples R China
Keywords
Cross-media retrieval; Multi-modality data; Multi-grained data; Multi-margin triplet loss; Margin-set;
DOI
10.24200/sci.2023.59834.6456
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
Traditional cross-media retrieval methods focus mainly on coarse-grained data that reflect global characteristics and ignore the fine-grained descriptions of local details. They also cannot accurately describe the correlations between the anchor and irrelevant data. This paper addresses both problems by fusing coarse-grained and fine-grained features and by introducing a multi-margin triplet loss, within a dual framework: (1) Framework I, a multi-grained data fusion framework based on a Deep Belief Network, and (2) Framework II, a multi-modality data fusion framework based on the multi-margin triplet loss function. In Framework I, the coarse-grained and fine-grained features are fused by a joint Restricted Boltzmann Machine, and the fused features are input into Framework II. In Framework II, we propose the multi-margin triplet loss, under which data that belong to different modalities and semantic categories are stepped away from the anchor by multiple margins. Experimental results on different datasets show that the proposed method achieves better cross-media retrieval performance than other methods. Furthermore, ablation experiments verify that the proposed multi-grained fusion strategy and the multi-margin triplet loss function are effective. (c) 2023 Sharif University of Technology. All rights reserved.
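The abstract does not give the formula for the multi-margin triplet loss, so the following is only a minimal sketch of one plausible reading: a standard hinge-style triplet loss in which the margin for each negative sample is looked up in a margin set according to how the negative differs from the anchor. The names `MARGIN_SET` and `multi_margin_triplet_loss`, the two negative kinds, the margin values, and the use of Euclidean distance are all assumptions made for illustration and are not taken from the paper.

```python
import math

# Hypothetical margin set (an assumption, not from the paper):
# a larger margin for negatives that differ from the anchor in more ways.
MARGIN_SET = {
    "same_modality": 0.2,   # different category, same modality as the anchor
    "cross_modality": 0.4,  # different category and different modality
}


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def multi_margin_triplet_loss(anchor, positive, negatives):
    """Sum of hinge triplet terms, one per negative sample.

    `negatives` is a list of (embedding, kind) pairs, where `kind` is a key
    of MARGIN_SET and selects the margin used for that negative.
    """
    d_pos = euclidean(anchor, positive)
    loss = 0.0
    for embedding, kind in negatives:
        margin = MARGIN_SET[kind]
        d_neg = euclidean(anchor, embedding)
        # The term is zero once the negative is at least `margin` farther
        # from the anchor than the positive is.
        loss += max(0.0, d_pos - d_neg + margin)
    return loss
```

Under these assumptions, a negative at the same distance from the anchor can contribute zero loss when tagged `same_modality` and a positive loss when tagged `cross_modality`, which is one way to push different kinds of irrelevant data away from the anchor by different amounts.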
Pages: 1645-1669
Number of pages: 25