Language-Augmented Pixel Embedding for Generalized Zero-Shot Learning

Cited by: 11

Authors:

Wang, Ziyang [1,2]

Gou, Yunhao [1,2]

Li, Jingjing [2]

Zhu, Lei [3]

Shen, Heng Tao [3]

Affiliations:

[1] Univ Elect Sci & Technol China, Yangtze Delta Reg Inst Huzhou, Huzhou 313002, Peoples R China

[2] Univ Elect Sci & Technol China, Sch Informat & Software Engn, Chengdu 611731, Peoples R China

[3] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 611731, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2023, Vol. 33, Issue 03

Funding:

National Natural Science Foundation of China;

Keywords:

Semantics; Visualization; Task analysis; Feature extraction; Image recognition; Annotations; Knowledge transfer; Zero-shot learning; transfer learning; attention mechanism;

DOI:

10.1109/TCSVT.2022.3208256

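Chinese Library Classification (CLC) number:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification code: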

0808 ; 0809 ;

Abstract:

Zero-shot Learning (ZSL) aims to recognize novel classes using knowledge learned from seen classes. The canonical approach to ZSL leverages a visual-to-semantic embedding to map the global features of an image sample to its semantic representation. These global features usually overlook the fine-grained information that is vital for knowledge transfer between seen and unseen classes, rendering them sub-optimal for the ZSL task, especially for the more realistic Generalized Zero-shot Learning (GZSL) task, where the global features of similar classes can hardly be separated. To remedy this problem, we propose Language-Augmented Pixel Embedding (LAPE), which directly bridges the visual and semantic spaces in a pixel-based manner. To this end, we map the local features of each pixel to different attributes and then extract each semantic attribute from the corresponding pixel. However, the lack of pixel-level annotation leads to inefficient pixel-based knowledge transfer. To mitigate this dilemma, we adopt the text information of each attribute to augment the local features of the image pixels that are related to the semantic attributes. Experiments on four ZSL benchmarks demonstrate that LAPE outperforms current state-of-the-art methods. Comprehensive ablation studies and analyses are provided to dissect the factors that lead to this success.
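The pixel-to-attribute mapping described in the abstract can be illustrated with a small toy sketch. This is not the authors' LAPE implementation: the record gives no architectural details, so the sketch swaps in a generic attribute-wise attention over pixel features. The function names (`attribute_scores`, `predict`), the use of attribute text embeddings as attention queries, and the dot-product class scoring are all assumptions made purely for illustration.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attribute_scores(pixel_features, attribute_embeddings):
    """Hypothetical helper: one score per attribute.

    For each attribute embedding (assumed to come from the attribute's
    text), attend over all pixels, pool the pixel features with those
    attention weights, and score the pooled feature against the embedding.
    """
    scores = []
    for v in attribute_embeddings:
        weights = softmax([dot(f, v) for f in pixel_features])
        pooled = [
            sum(w * f[d] for w, f in zip(weights, pixel_features))
            for d in range(len(v))
        ]
        scores.append(dot(pooled, v))
    return scores


def predict(pixel_features, attribute_embeddings, class_attributes):
    """Pick the class whose attribute signature best matches the scores."""
    pred = attribute_scores(pixel_features, attribute_embeddings)
    class_scores = {
        name: dot(pred, signature)
        for name, signature in class_attributes.items()
    }
    return max(class_scores, key=class_scores.get)


# Toy data (all values invented): 3 pixels, 2 attributes, 2 classes.
pixels = [[3.0, 0.0], [0.0, 0.0], [0.0, 0.1]]
attributes = [[1.0, 0.0], [0.0, 1.0]]  # stand-ins for text embeddings
classes = {"striped": [1.0, 0.0], "spotted": [0.0, 1.0]}
print(predict(pixels, attributes, classes))  # prints "striped"
```

Because class signatures are defined over attributes rather than learned per class, an unseen class can be scored in the same way as a seen one once its attribute signature is known, which is the property zero-shot methods of this kind rely on.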

Citation

Pages: 1019-1030

Page count: 12

