A Cross-Modal Alignment for Zero-Shot Image Classification

Cited by: 5

Authors:

Wu, Lu [1, 2]

Wu, Chenyu [2]

Guo, Han [3]

Zhao, Zhihao [2]

Affiliations:

[1] Minist Nat Resources, Key Lab Urban Land Resources Monitoring & Simulat, Shenzhen 518000, Peoples R China

[2] Wuhan Univ Technol, Sch Informat Engn, Wuhan 430070, Peoples R China

[3] Wuhan Univ, Sch Resource & Environm Sci, Wuhan 430079, Peoples R China

Source:

IEEE ACCESS | 2023 / Volume 11

Keywords:

Visualization; Semantics; Training data; Feature extraction; Object recognition; Monitoring; Image classification; Cross-modal alignment; Zero-shot image classification; Text attribute query; Cosine similarity

DOI:

10.1109/ACCESS.2023.3237966

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Unlike mainstream classification methods that rely on large amounts of annotated data, we introduce a cross-modal alignment method for zero-shot image classification. The key idea is to use text attribute queries learned from the seen classes to guide local feature responses in unseen classes. First, an encoder aligns visual features semantically with their corresponding text attributes. Second, an attention module produces response maps from the feature maps activated by the text attribute queries. Finally, a cosine distance metric measures how well each text attribute matches its corresponding feature response. Experimental results show that the method outperforms existing embedding-based zero-shot learning methods, as well as generative methods, on the CUB-200-2011 dataset.
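The three steps named in the abstract (alignment in a shared space, attribute-queried attention, cosine matching) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the record gives no architectural details, so the scaled dot-product attention, the assumption that visual and text features already share one embedding space, the class-attribute matrix, and all function names (`attribute_responses`, `cosine_similarity`, `classify`) are hypothetical choices made only for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attribute_responses(feature_map, attribute_queries):
    """Attend over local visual features with text attribute queries.

    feature_map:       (R, D) local features for R image regions
                       (assumed already projected into the text space).
    attribute_queries: (A, D) one embedding per text attribute.
    Returns (A, D): one attended visual response per attribute.
    """
    d = feature_map.shape[1]
    # Assumption: scaled dot-product scores between attributes and regions.
    scores = attribute_queries @ feature_map.T / np.sqrt(d)  # (A, R)
    response_map = softmax(scores, axis=1)  # each row sums to 1 over regions
    return response_map @ feature_map  # (A, D)


def cosine_similarity(a, b, eps=1e-8):
    """Row-wise cosine similarity between two (N, D) arrays."""
    a_n = a / (np.linalg.norm(a, axis=-1, keepdims=True) + eps)
    b_n = b / (np.linalg.norm(b, axis=-1, keepdims=True) + eps)
    return np.sum(a_n * b_n, axis=-1)


def classify(feature_map, attribute_queries, class_attributes):
    """Score classes by how well attribute responses match their queries.

    class_attributes: (C, A) hypothetical class-to-attribute strengths,
                      available for unseen classes without any images.
    Returns (predicted class index, class scores of shape (C,)).
    """
    responses = attribute_responses(feature_map, attribute_queries)
    attr_scores = cosine_similarity(responses, attribute_queries)  # (A,)
    class_scores = class_attributes @ attr_scores  # (C,)
    return int(np.argmax(class_scores)), class_scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    regions = rng.normal(size=(49, 16))   # e.g. a 7x7 grid of local features
    queries = rng.normal(size=(5, 16))    # 5 text attributes
    classes = rng.random(size=(3, 5))     # 3 unseen classes
    pred, scores = classify(regions, queries, classes)
    print("predicted class index:", pred)
```

Because unseen classes are described only through their attribute vectors in this sketch, adding a new class means adding a row to `class_attributes`, with no retraining of the attention step.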

Pages: 9067-9073

Page count: 7
