ADAPTIVE MULTI-SCALE SEMANTIC FUSION NETWORK FOR ZERO-SHOT LEARNING

Citations: 0

Authors:

Song, Jing [1]

Peng, Peixi [2]

Zhai, Yunpeng [1]

Zhang, Chong [1]

Tian, Yonghong [2]

Affiliations:

[1] Peking Univ, Shenzhen Grad Sch, Shenzhen, Peoples R China

[2] Peking Univ, Beijing, Peoples R China

Source:

2021 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW) | 2021

Keywords:

Multi-scale; attribute attention; Semantic fusion; global and local semantic attributes; class-center triplet loss;

DOI:

10.1109/ICMEW53276.2021.9455945

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification codes:

081203 ; 0835 ;

Abstract:

Zero-shot learning aims at accurately recognizing unseen objects by learning matrices that bridge the gap between visual information and semantic attributes. Existing approaches predominantly focus on learning a proper mapping function for visual-semantic embedding while neglecting the effect of learning discriminative semantic features, which leads to severe semantic ambiguity. We propose a practical Adaptive Multi-scale Semantic Fusion (AMSF) framework to perform object-based multi-scale attribute attention for semantic disambiguation. Considering both the low-level visual information and the global class-level features that relate to this ambiguity, the proposed method jointly learns cooperative global and local semantic attributes from different scales. Moreover, with the joint supervision of an embedding softmax loss and a class-center triplet loss, the model is encouraged to learn highly discriminative semantic and visual features with high inter-class dispersion and intra-class compactness. The method is evaluated on the CUB, AwA2, and SUN datasets, and the experimental results indicate that it achieves state-of-the-art performance.
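
The joint supervision mentioned in the abstract can be illustrated with a minimal sketch. This record does not give the exact formulation used by AMSF, so the code below assumes a common hinge-style triplet loss measured against class centers and a standard cross-entropy over compatibility scores. The function names, the squared Euclidean distance, and the margin value are illustrative assumptions, not the authors' definitions.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def class_center_triplet_loss(feature, label, centers, margin=1.0):
    """Hinge loss pulling a feature toward its own class center and
    pushing it away from the nearest center of any other class.

    Assumed form: max(0, d(f, c_y) - min_{k != y} d(f, c_k) + margin).
    `centers` maps a class label to its center vector (hypothetical layout).
    """
    positive = squared_distance(feature, centers[label])
    negative = min(
        squared_distance(feature, center)
        for other, center in centers.items()
        if other != label
    )
    return max(0.0, positive - negative + margin)


def embedding_softmax_loss(scores, label):
    """Cross-entropy over per-class compatibility scores.

    `scores[k]` is assumed to be the compatibility between the embedded
    visual feature and the semantic attribute vector of class k.
    """
    top = max(scores)  # subtract the max for numerical stability
    log_sum = top + math.log(sum(math.exp(s - top) for s in scores))
    return log_sum - scores[label]


def joint_loss(feature, scores, label, centers, weight=1.0, margin=1.0):
    """Weighted sum of the two terms; `weight` is an assumed hyperparameter."""
    return embedding_softmax_loss(scores, label) + weight * class_center_triplet_loss(
        feature, label, centers, margin
    )
```

Under these assumptions, a feature that sits on its own class center and far from every other center contributes zero triplet loss (intra-class compactness and inter-class dispersion are both satisfied), while a feature that sits on the wrong center is penalized by the full distance plus the margin.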


Pages: 6

50 items in total

[1] Exploiting multi-scale contextual prompt learning for zero-shot semantic segmentation
Wang, Yiqi
Tian, Yingjie
[J]. DISPLAYS, 2024, 81
[2] Multi-scale visual attention for attribute disambiguation in zero-shot learning
Tian, Long
Chen, Bo
Ren, Jie
Zhang, Hao
Wu, Zhenhua
Han, Ning
Chen, Yuanwei
Liu, Hongwei
[J]. SIGNAL PROCESSING-IMAGE COMMUNICATION, 2022, 103
[3] Attentive Semantic Preservation Network for Zero-Shot Learning
Lu, Ziqian
Yu, Yunlong
Lu, Zhe-Ming
Shen, Feng-Li
Zhang, Zhongfei
[J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020), 2020, : 2919 - 2925
[4] Semantic Consistent Embedding for Domain Adaptive Zero-Shot Learning
Zhang, Jianyang
Yang, Guowu
Hu, Ping
Lin, Guosheng
Lv, Fengmao
[J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 4024 - 4035
[5] MSDN: Mutually Semantic Distillation Network for Zero-Shot Learning
Chen, Shiming
Hong, Ziming
Xie, Guo-Sen
Yang, Wenhan
Peng, Qinmu
Wang, Kai
Zhao, Jian
You, Xinge
[J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 7602 - 7611
[6] Semantic Autoencoder for Zero-Shot Learning
Kodirov, Elyor
Xiang, Tao
Gong, Shaogang
[J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 4447 - 4456
[7] Learning semantic ambiguities for zero-shot learning
Hanouti, Celina
Le Borgne, Herve
[J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (26) : 40745 - 40759
[8] Learning semantic ambiguities for zero-shot learning
Celina Hanouti
Hervé Le Borgne
[J]. Multimedia Tools and Applications, 2023, 82 : 40745 - 40759
[9] Multi-Scale Speaker Vectors for Zero-Shot Speech Synthesis
Cory, Tristin
Iqbal, Razib
[J]. 2022 IEEE 46TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE (COMPSAC 2022), 2022, : 496 - 501
[10] Visual-Semantic Aligned Bidirectional Network for Zero-Shot Learning
Gao, Rui
Hou, Xingsong
Qin, Jie
Shen, Yuming
Long, Yang
Liu, Li
Zhang, Zhao
Shao, Ling
[J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 1649 - 1664
