Cross-media similarity metric learning with unified deep networks

Cited by: 0
Authors
Jinwei Qi
Xin Huang
Yuxin Peng
Affiliation
[1] Peking University, Institute of Computer Science and Technology
Source
Multimedia Tools and Applications | 2017, Vol. 76
Keywords
Cross-media retrieval; Representation learning; Metric learning
DOI
Not available
Abstract
As a prominent research topic in the multimedia area, cross-media retrieval aims to capture the complex correlations among multiple media types. Learning a better shared representation and distance metric for multimedia data is important for improving cross-media retrieval. Motivated by the strong ability of deep neural networks to learn feature representations and comparison functions, we propose the Unified Network for Cross-media Similarity Metric (UNCSM), which combines cross-media shared representation learning and distance metric learning in a unified framework. First, we design a two-pathway deep network that is pretrained with a contrastive loss and fine-tuned with a double triplet similarity loss, so that the shared representation for each media type is learned by modeling relative semantic similarity. Second, we design a metric network that computes the cross-media similarity of the shared representations by modeling pairwise similar and dissimilar constraints. Existing methods mostly ignore the dissimilar constraints and apply a simple distance metric, such as Euclidean distance, as a separate step. In contrast, our UNCSM approach unifies representation learning and distance metric learning, which preserves relative similarity and supports more complex similarity functions, further improving cross-media retrieval accuracy. Experimental results show that our UNCSM approach outperforms 8 state-of-the-art methods on 4 widely used cross-media datasets.
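The two training losses named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so what follows uses the generic textbook forms of the contrastive and triplet losses, not the paper's exact definitions. The function names, the `margin` parameter, the use of Euclidean distance inside the losses, and the reading of "double triplet" as one triplet anchored in each media type are all assumptions made for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(x, y, similar, margin=1.0):
    """Generic contrastive loss on one cross-media pair (assumed form).

    Similar pairs are pulled together; dissimilar pairs are pushed
    apart until they are at least `margin` away from each other.
    """
    d = euclidean(x, y)
    if similar:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Generic triplet loss (assumed form): the anchor should be closer
    to the positive than to the negative by at least `margin`."""
    return max(0.0, margin + euclidean(anchor, positive)
               - euclidean(anchor, negative))


def double_triplet_loss(img, txt_pos, txt_neg, txt, img_pos, img_neg,
                        margin=1.0):
    """Hypothetical reading of a "double triplet" loss: one triplet
    anchored on an image, one anchored on a text, summed."""
    return (triplet_loss(img, txt_pos, txt_neg, margin)
            + triplet_loss(txt, img_pos, img_neg, margin))
```

In this sketch the contrastive term only sees pairs labeled similar or dissimilar, while the triplet terms encode relative similarity: each anchor is compared against a matching and a non-matching sample from the other media type. The learned metric network described in the abstract would replace `euclidean` with a trainable similarity function, which is not shown here.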
Pages: 25109-25127
Page count: 18