Visual-Semantic Aligned Bidirectional Network for Zero-Shot Learning

Cited by: 9
Authors
Gao, Rui [2]
Hou, Xingsong [2]
Qin, Jie [1]
Shen, Yuming [3]
Long, Yang [4]
Liu, Li [5]
Zhang, Zhao [6]
Shao, Ling [5]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Nanjing 211106, Peoples R China
[2] Xi An Jiao Tong Univ, Dept Elect & Informat Engn, Xian 710049, Peoples R China
[3] Univ Oxford, Dept Engn Sci, Oxford OX1 3PJ, England
[4] Univ Durham, Dept Comp Sci, Durham DH1 3LE, England
[5] Incept Inst Artificial Intelligence, Abu Dhabi, U Arab Emirates
[6] Hefei Univ Technol, Sch Comp & Informat, Hefei 230601, Peoples R China
Keywords
Bidirectional network; generative model; zero-shot learning; adversarial network; kernel
DOI
10.1109/TMM.2022.3145666
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Zero-shot learning (ZSL) aims to recognize unknown categories that are unavailable during training. Recently, generative models have shown the potential to address this challenging problem by synthesizing unseen features conditioned on semantic embeddings such as attributes. However, unidirectional generative models cannot guarantee the effective coupling between visual and semantic spaces. To this end, we propose a visual-semantic aligned bidirectional network with cycle consistency to alleviate the gap between these two spaces, generating unseen features of high quality. More importantly, we incorporate two carefully designed strategies into our bidirectional framework to improve the overall ZSL performance. Specifically, we enhance the intra-domain class divergence in both visual and semantic spaces, and in the meantime, mitigate the inter-domain shift to preserve seen-unseen domain discrimination. Experimental results on four standard benchmarks show the superiority of our framework over existing state-of-the-art methods under both conventional and generalized ZSL settings.
Pages: 1649-1664
Number of pages: 16