A Simplified Framework for Zero-shot Cross-Modal Sketch Data Retrieval

Cited by: 7
Authors
Chaudhuri, Ushasi [1]
Banerjee, Biplab [1]
Bhattacharya, Avik [1]
Datcu, Mihai [2]
Affiliations
[1] Indian Inst Technol, Mumbai, Maharashtra, India
[2] German Aerosp Ctr DLR, Cologne, Germany
DOI
10.1109/CVPRW50498.2020.00099
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We address zero-shot cross-modal image retrieval between color and sketch images through a novel deep representation learning technique. Sketch-to-image retrieval, and vice versa, is of practical importance, and a trained model is expected to generalize beyond the training classes, i.e., the zero-shot learning scenario. Because of the drastic distribution gap between the two modalities, feature alignment is necessary to learn a shared feature space in which retrieval can be carried out efficiently. The shared space must also be semantically meaningful to aid the zero-shot retrieval task. The very few existing techniques for zero-shot sketch-RGB image retrieval extend deep generative models to learn the embedding space; however, training a typical GAN-like model on multi-modal image data can be non-trivial. To this end, we propose a multi-stream encoder-decoder model that simultaneously ensures an improved mapping between the RGB and sketch image spaces and high discrimination in the shared, semantics-driven encoded feature space. Further, the class topology of the original semantic space is guaranteed to be preserved in the encoded feature space, which in turn reduces the model's bias towards the training classes. Experimental results on the benchmark Sketchy and TU-Berlin datasets establish the efficacy of our model, which outperforms the existing state-of-the-art techniques by a considerable margin.
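The abstract describes retrieval in a shared feature space but does not specify the retrieval step itself. The following is a minimal sketch of what that step could look like, assuming the encoders have already mapped a sketch query and a gallery of RGB images into the same space and that cosine similarity is used for ranking. The function names, the choice of cosine similarity, and the toy three-dimensional vectors are illustrative assumptions, not details taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_gallery(query_embedding, gallery_embeddings):
    """Return (gallery_index, score) pairs sorted from most to least similar."""
    scores = [
        (idx, cosine_similarity(query_embedding, emb))
        for idx, emb in enumerate(gallery_embeddings)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical embeddings: stand-ins for encoder outputs in the shared space.
sketch_query = [0.9, 0.1, 0.0]
image_gallery = [
    [0.0, 1.0, 0.0],
    [1.0, 0.2, 0.0],
    [0.0, 0.0, 1.0],
]

ranking = rank_gallery(sketch_query, image_gallery)
best_index = ranking[0][0]
print([idx for idx, _ in ranking])
```

In a zero-shot setting the gallery would contain images from classes unseen during training, so the quality of this ranking depends entirely on how well the shared space generalizes.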
Pages: 699-706
Page count: 8