ADAPTIVE MULTI-SCALE SEMANTIC FUSION NETWORK FOR ZERO-SHOT LEARNING

Authors
Song, Jing [1]
Peng, Peixi [2]
Zhai, Yunpeng [1]
Zhang, Chong [1]
Tian, Yonghong [2]
Affiliations
[1] Peking Univ, Shenzhen Grad Sch, Shenzhen, Peoples R China
[2] Peking Univ, Beijing, Peoples R China
Keywords
Multi-scale; attribute attention; semantic fusion; global and local semantic attributes; class-center triplet loss
DOI
10.1109/ICMEW53276.2021.9455945
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Zero-shot learning aims at accurately recognizing unseen objects by learning matrices that bridge the gap between visual information and semantic attributes. Existing approaches predominantly focus on learning a proper mapping function for visual-semantic embedding while neglecting the effect of learning discriminative semantic features, which leads to severe semantic ambiguity. We propose a practical Adaptive Multi-scale Semantic Fusion (AMSF) framework that performs object-based multi-scale attribute attention for semantic disambiguation. Considering both the low-level visual information and the global class-level features that relate to this ambiguity, the proposed method jointly learns cooperative global and local semantic attributes from different scales. Moreover, under the joint supervision of an embedding softmax loss and a class-center triplet loss, the model is encouraged to learn highly discriminative semantic and visual features with high inter-class dispersion and intra-class compactness. The method is evaluated on the CUB, AwA2, and SUN datasets, and the experimental results indicate that it achieves state-of-the-art performance.
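The abstract names a class-center triplet loss but does not define it. As a rough illustration only, the sketch below shows one common way such a loss can be written: each sample is pulled toward the mean of its own class and pushed away from the nearest other class center by a margin. The function name `class_center_triplet_loss`, the Euclidean distance, the mean-based centers, and the `margin` parameter are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def class_center_triplet_loss(features, labels, margin=1.0):
    """Hypothetical class-center triplet loss (illustrative, not the paper's definition).

    For each sample: hinge(d(own center) - d(nearest other center) + margin),
    averaged over all samples. Centers are per-class feature means.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)

    # One center per class: the mean feature vector of that class.
    centers = np.stack([features[labels == c].mean(axis=0) for c in classes])

    # Euclidean distance from every sample to every center: (n_samples, n_classes).
    dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)

    # Boolean mask marking each sample's own class column (one True per row).
    own = labels[:, None] == classes[None, :]

    pos = dists[own]                                # distance to own center
    neg = np.where(own, np.inf, dists).min(axis=1)  # nearest other center

    return float(np.maximum(0.0, pos - neg + margin).mean())


# Two well-separated toy classes in 2-D.
toy_features = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
toy_labels = [0, 0, 1, 1]
print(class_center_triplet_loss(toy_features, toy_labels, margin=1.0))
```

With well-separated classes and a small margin the hinge is inactive and the loss is zero; a larger margin makes it positive, which is the pressure toward intra-class compactness and inter-class dispersion that the abstract describes.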
Pages: 6