Saliency Inside: Learning Attentive CNNs for Content-Based Image Retrieval

Cited by: 50
Authors
Wei, Shikui [1 ,2 ]
Liao, Lixin [1 ,2 ]
Li, Jia [3 ,4 ,5 ]
Zheng, Qinjie [1 ,2 ]
Yang, Fei [1 ,2 ]
Zhao, Yao [1 ,2 ]
Affiliations
[1] Beijing Jiaotong Univ, Inst Informat Sci, Beijing 100044, Peoples R China
[2] Beijing Key Lab Adv Informat Sci & Network Techno, Beijing 100044, Peoples R China
[3] Beihang Univ, Sch Comp Sci & Engn, State Key Lab Virtual Real Technol & Syst, Beijing 100191, Peoples R China
[4] Beihang Univ, Beijing Adv Innovat Ctr Big Data & Brain Comp, Beijing 100191, Peoples R China
[5] Peng Cheng Lab, Shenzhen 518000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Visual saliency; content-based image retrieval; bag-of-word; convolutional neural networks; END;
DOI
10.1109/TIP.2019.2913513
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In content-based image retrieval (CBIR), one of the most challenging and ambiguous tasks is to correctly understand the human query intention and measure its semantic relevance to the images in the database. Because visual saliency predicts human visual attention well, and attention is closely related to query intention, this paper attempts to explicitly determine the effect of visual saliency on CBIR through qualitative and quantitative experiments. Toward this end, we first generate fixation density maps for the images of a widely used CBIR dataset using an eye-tracking apparatus. These ground-truth saliency maps are then used to measure the influence of visual saliency on the CBIR task by exploring several plausible ways of incorporating saliency cues into the retrieval process. We find that visual saliency is indeed beneficial to the CBIR task, and that the best scheme for incorporating saliency may differ across image retrieval models. Inspired by these findings, this paper presents two-stream attentive convolutional neural networks (CNNs) with saliency embedded inside for CBIR. The proposed network has two streams that simultaneously handle two tasks. The main stream focuses on extracting discriminative visual features that are tightly related to semantic attributes. Meanwhile, the auxiliary stream aims to assist the main stream by redirecting feature extraction to the salient image content that a human may pay attention to. By fusing these two streams into the Main and Auxiliary CNNs (MAC), image similarity can be computed as a human would compute it, by preserving conspicuous content and suppressing irrelevant regions. Extensive experiments show that the proposed model achieves strong image retrieval performance on four public datasets.
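The idea of preserving conspicuous content and suppressing irrelevant regions when comparing images can be illustrated with a minimal sketch. This is not the MAC architecture from the paper: it assumes a feature map and a saliency map are already available, and it uses simple saliency-weighted sum pooling followed by cosine similarity. The function names, array shapes, and toy data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def saliency_weighted_descriptor(features, saliency, eps=1e-12):
    """Pool a (C, H, W) feature map into an L2-normalized C-vector.

    `saliency` is a non-negative (H, W) map. Each spatial location
    contributes in proportion to its saliency (an illustrative choice,
    not the fusion used in the paper).
    """
    weights = saliency / (saliency.sum() + eps)
    desc = (features * weights[None, :, :]).sum(axis=(1, 2))
    return desc / (np.linalg.norm(desc) + eps)


def cosine_similarity(a, b):
    """Cosine similarity of two already L2-normalized vectors."""
    return float(np.dot(a, b))


# Toy data: two 4x4 "images" that share an object in the center
# (channel 0) but have different backgrounds (channel 1 vs. channel 2).
center = np.zeros((4, 4))
center[1:3, 1:3] = 1.0
background = 1.0 - center

image_a = np.stack([center, background, np.zeros((4, 4))])
image_b = np.stack([center, np.zeros((4, 4)), background])

# Assumed saliency map: attention falls on the central object only.
salient = center
# Baseline: every location weighted equally.
uniform = np.ones((4, 4))

salient_sim = cosine_similarity(
    saliency_weighted_descriptor(image_a, salient),
    saliency_weighted_descriptor(image_b, salient),
)
uniform_sim = cosine_similarity(
    saliency_weighted_descriptor(image_a, uniform),
    saliency_weighted_descriptor(image_b, uniform),
)
print(salient_sim, uniform_sim)
```

In this toy case the saliency-weighted descriptors ignore the differing backgrounds, so the two images score as far more similar than under uniform pooling, which mixes background responses into the descriptor.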
Pages: 4580-4593
Number of pages: 14