Sketch-based Image Retrieval Using Cross-domain Modeling and Deep Fusion Network

Cited: 0
Authors
Yu D. [1 ]
Liu Y.-J. [1 ]
Xing M.-M. [1 ]
Li Z.-M. [1 ]
Li H. [2 ]
Affiliations
[1] College of Computer and Communication Engineering, China University of Petroleum (East China), Qingdao
[2] Institute of Computing Technology, Chinese Academy of Sciences, Beijing
Source
Ruan Jian Xue Bao/Journal of Software | 2019 / Vol. 30 / No. 11
Funding
National Natural Science Foundation of China
Keywords
Cross-domain modeling; Deep learning; Feature fusion; Multi-layer deep fusion convolutional neural network; Sketch-based image retrieval (SBIR)
DOI
10.13328/j.cnki.jos.005570
Abstract
This paper introduces a new approach to free-hand sketch representation for sketch-based image retrieval (SBIR), where sketches serve as queries to search for natural photos in a natural image dataset. The task is extremely challenging for three main reasons: (1) sketches are highly abstract compared with natural photos, so existing methods can extract less context as descriptors; (2) different people draw widely different sketches of the same object, which makes sketch-photo matching harder; (3) mapping sketches and photos into a common domain is itself difficult. In this study, the cross-domain problem is addressed by mapping sketches and natural photos at multiple layers. For the first time, a multi-layer deep CNN framework is introduced to train multi-layer representations of free-hand sketches and natural photos. The Flickr15k dataset is used as the retrieval benchmark, and the learned representation is shown to significantly outperform both hand-crafted features and deep features trained on sketches or photos alone. © Copyright 2019, Institute of Software, the Chinese Academy of Sciences. All rights reserved.
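The retrieval pipeline the abstract describes, embedding sketches and photos into a shared space via multi-layer feature fusion and then ranking photos by similarity, can be illustrated with a minimal sketch. The toy descriptors, fusion weights, and helper names below are hypothetical stand-ins, not the paper's actual network or features.

```python
import numpy as np

def fuse_layers(layer_feats, weights):
    """Concatenate L2-normalized per-layer features, scaled by fusion weights."""
    parts = []
    for f, w in zip(layer_feats, weights):
        f = f / (np.linalg.norm(f) + 1e-12)  # normalize each layer's descriptor
        parts.append(w * f)
    return np.concatenate(parts)

def cosine(a, b):
    """Cosine similarity between two fused descriptors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def retrieve(query_layers, gallery, weights):
    """Rank gallery photos by cosine similarity to the fused sketch descriptor."""
    q = fuse_layers(query_layers, weights)
    scored = [(name, cosine(q, fuse_layers(feats, weights)))
              for name, feats in gallery.items()]
    return sorted(scored, key=lambda x: -x[1])

# Toy example: each image is represented by two "layers" of descriptors.
rng = np.random.default_rng(0)
sketch = [rng.normal(size=64), rng.normal(size=128)]
gallery = {
    "photo_match": [sketch[0] + 0.1 * rng.normal(size=64),
                    sketch[1] + 0.1 * rng.normal(size=128)],
    "photo_other": [rng.normal(size=64), rng.normal(size=128)],
}
ranking = retrieve(sketch, gallery, weights=[0.5, 0.5])
```

In the paper's setting the per-layer descriptors would come from intermediate layers of the trained fusion CNN rather than random vectors; the ranking step over a fused common-domain embedding is the same idea.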
Pages: 3567-3577
Page count: 10
Related Papers
23 records
  • [1] Eitz M., Hays J., Alexa M., How do humans sketch objects?, ACM Trans. On Graph, 31, 4, pp. 44:1-44:10, (2012)
  • [2] Fu H., Zhou S., Liu L., Et al., Animated Construction of Line Drawings, ACM Trans. on Graphics, 30, 6, pp. 1-10, (2011)
  • [3] Yu Q., Yang Y., Song Y.Z., Et al., Sketch-a-Net that beats humans, (2015)
  • [4] Sangkloy P., Burnell N., Ham C., Et al., The sketchy database: Learning to retrieve badly drawn bunnies, ACM Trans. on Graphics (TOG), 35, 4, (2016)
  • [5] Lim J.J., Zitnick C.L., Dollar P., Sketch tokens: A learned mid-level representation for contour and object detection, Proc. of the 2013 IEEE Conf. on Computer Vision and Pattern Recognition, pp. 3158-3165, (2013)
  • [6] Arbelaez P., Maire M., Fowlkes C., Et al., Contour detection and hierarchical image segmentation, IEEE Trans. on Pattern Analysis and Machine Intelligence, 33, 5, pp. 898-916, (2011)
  • [7] Yu Q., Liu F., Song Y.Z., Et al., Sketch me that shoe, Proc. of the 2016 IEEE Conf. on Computer Vision and Pattern Recognition, pp. 799-807, (2016)
  • [8] Su H., Maji S., Kalogerakis E., Et al., Multi-View convolutional neural networks for 3D shape recognition, Proc. of the 2015 IEEE Int'l Conf. on Computer Vision, pp. 945-953, (2015)
  • [9] Lowe D.G., Distinctive image features from scale-invariant keypoints, Int'l Journal of Computer Vision, 60, 2, pp. 91-110, (2004)
  • [10] Mori G., Belongie S., Malik J., Efficient shape matching using shape contexts, IEEE Trans. on Pattern Analysis and Machine Intelligence, 27, 11, pp. 1832-1837, (2005)