Deep Top-k Ranking for Image-Sentence Matching

Cited by: 30
Authors
Zhang, Lingling [1]
Luo, Minnan [2]
Liu, Jun [2]
Chang, Xiaojun [3]
Yang, Yi [4]
Hauptmann, Alexander G. [5]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Elect & Informat Engn, Key Lab Intelligent Networks & Network Secur, Minist Educ, Xian 710049, Peoples R China
[2] Xi An Jiao Tong Univ, Natl Engn Lab Big Data Analyt, Xian 710049, Peoples R China
[3] Monash Univ, Fac Informat Technol, Clayton Vic 3800, Australia
[4] Univ Technol Sydney, Ctr Quantum Computat & Intelligent Syst, Ultimo, NSW 2007, Australia
[5] Carnegie Mellon Univ, Sch Comp Sci, Pittsburgh, PA 15213 USA
Funding
Australian Research Council; National Natural Science Foundation of China;
Keywords
Image-sentence matching; cross-modal retrieval; deep learning; top-k ranking; FUSION;
DOI
10.1109/TMM.2019.2931352
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Image-sentence matching is challenging because of the heterogeneity gap between modalities. Ranking-based methods have achieved excellent performance on this task in recent decades. Given an image query, these methods typically assume that the correctly matched image-sentence pair must rank ahead of all mismatched ones. However, this assumption may be too strict and prone to overfitting, especially when some sentences in a massive database are similar and easily confused with one another. In this paper, we relax the traditional ranking loss and propose a novel deep multi-modal network with a top-k ranking loss to mitigate this data ambiguity problem. With this strategy, query results are not penalized unless the ground truth falls outside the top-k query results. Because the original top-k ranking loss is non-smooth and non-convex, we approximate it with a tight convex upper bound and then optimize the deep multi-modal network with standard back-propagation. Finally, we evaluate the method on three benchmark datasets: Flickr8k, Flickr30k, and MSCOCO. Empirical results on the R@K metrics (K = 1, 5, 10) show that our method performs comparably to state-of-the-art methods.
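The top-k idea in the abstract can be illustrated with a small sketch. The record does not give the paper's actual loss or its convex bound, so everything below is an assumption made for illustration: the function names are hypothetical, the similarity scores are plain Python lists, and the surrogate is a generic top-k hinge (the average of the k largest margin violations), which is one well-known convex upper bound on the top-k 0/1 loss and not necessarily the bound the authors derive. The R@K helper follows the usual definition: the fraction of queries whose ground truth ranks within the top K.

```python
def rank_of_truth(scores, truth):
    """1-based rank of scores[truth] under descending order.

    Ties are counted against the ground truth (pessimistic ranking).
    """
    s = scores[truth]
    return 1 + sum(1 for j, v in enumerate(scores) if j != truth and v >= s)


def topk_zero_one_loss(scores, truth, k):
    """Non-smooth, non-convex loss: penalize only if the truth is outside the top k."""
    return 0.0 if rank_of_truth(scores, truth) <= k else 1.0


def topk_hinge_surrogate(scores, truth, k, margin=1.0):
    """Illustrative convex surrogate (an assumption, not the paper's bound).

    Averages the k largest margin violations among mismatched candidates and
    clips at zero. With margin >= 1 this upper-bounds topk_zero_one_loss.
    """
    violations = sorted(
        (margin + v - scores[truth] for j, v in enumerate(scores) if j != truth),
        reverse=True,
    )
    if not 1 <= k <= len(violations):
        raise ValueError("k must be between 1 and the number of mismatched candidates")
    return max(0.0, sum(violations[:k]) / k)


def recall_at_k(score_rows, truths, k):
    """R@K: fraction of queries whose ground truth ranks within the top k."""
    hits = sum(
        1 for scores, t in zip(score_rows, truths) if rank_of_truth(scores, t) <= k
    )
    return hits / len(truths)


# Hypothetical similarity scores of one image query against four sentences;
# index 1 is the ground-truth sentence, which is ranked second here.
example_scores = [0.9, 0.8, 0.3, 0.1]
example_truth = 1
loss_strict = topk_zero_one_loss(example_scores, example_truth, k=1)
loss_relaxed = topk_zero_one_loss(example_scores, example_truth, k=2)
```

In this toy query the ground truth is outranked by one confusable sentence, so the strict (k = 1) loss penalizes it while the relaxed (k = 2) loss does not, which is the relaxation the abstract describes.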
Pages: 775-785
Page count: 11
Related papers
50 in total
  • [41] Expressive top-k matching for conditional graph patterns
    Houari Mahfoud
    [J]. Neural Computing and Applications, 2022, 34 : 14205 - 14221
  • [42] Distributed Top-k Subgraph Matching in A Big Graph
    Gao, Jianliang
    Lei, Chuqi
    Tian, Ling
    Ling, Yuan
    Chen, Zheng
    Song, Bo
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2018, : 5325 - 5327
  • [43] Fast Algorithms for Top-k Approximate String Matching
    Yang, Zhenglu
    Yu, Jianjun
    Kitsuregawa, Masaru
    [J]. PROCEEDINGS OF THE TWENTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-10), 2010, : 1467 - 1473
  • [44] Approximating Diversified Top-k Graph Pattern Matching
    Wang, Xin
    Zhan, Huayi
    [J]. DATABASE AND EXPERT SYSTEMS APPLICATIONS, DEXA 2018, PT I, 2018, 11029 : 407 - 423
  • [45] Accelerating Top-k ListNet Training for Ranking Using FPGA
    Li, Qiang
    Fleming, Shane T.
    Thomas, David B.
    Cheung, Peter Y. K.
    [J]. 2018 INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE TECHNOLOGY (FPT 2018), 2018, : 245 - 248
  • [46] Optimal Instance Adaptive Algorithm for the Top-K Ranking Problem
    Chen, Xi
    Gopi, Sivakanth
    Mao, Jieming
    Schneider, Jon
    [J]. IEEE TRANSACTIONS ON INFORMATION THEORY, 2018, 64 (09) : 6139 - 6160
  • [47] Top-k Similarity Matching in Large Graphs with Attributes
    Ding, Xiaofeng
    Jia, Jianhong
    Li, Jiuyong
    Liu, Jixue
    Jin, Hai
    [J]. DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2014, PT II, 2014, 8422 : 156 - 170
  • [48] A Rating-Ranking Method for Crowdsourced Top-k Computation
    Li, Kaiyu
    Zhang, Xiaohang
    Li, Guoliang
    [J]. SIGMOD'18: PROCEEDINGS OF THE 2018 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2018, : 975 - 990
  • [49] An Unsupervised Sampling Approach for Image-Sentence Matching Using Document-Level Structural Information
    Li, Zejun
    Wei, Zhongyu
    Fan, Zhihao
    Shan, Haijun
    Huang, Xuanjing
    [J]. THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 13324 - 13332
  • [50] Top-K Query Retrieval of Combinations with Sum-of-Subsets Ranking
    Majumder, Subhashis
    Sanyal, Biswajit
    Gupta, Prosenjit
    Sinha, Soumik
    Pande, Shiladitya
    Hon, Wing-Kai
    [J]. COMBINATORIAL OPTIMIZATION AND APPLICATIONS (COCOA 2014), 2014, 8881 : 490 - 505