DCMA-Net: dual cross-modal attention for fine-grained few-shot recognition

Authors
Yan Zhou
Xiao Ren
Jianxun Li
Yin Yang
Haibin Zhou
Affiliations
[1] Xiangtan University, School of Automation and Electronic Information
[2] Shanghai Jiao Tong University, School of Electronic Information and Electrical Engineering
[3] Xiangtan University, School of Mathematics and Computational Science
Keywords
Few-shot learning; Fine-grained image recognition; Attention mechanism; Cross-modal fusion; Prototype
DOI
Not available
Abstract
Because comprehensive labeled samples are expensive to obtain, the fine-grained few-shot recognition task aims to identify unseen meta classes from only one or a few labeled samples. Fine-grained recognition also faces challenges such as minimal inter-class variation and background clutter, and most previous methods rely on a single visual modality. In this paper, we propose a novel Dual Cross-modal Attention Network (DCMA-Net) to address these problems. Concretely, we first propose the Local Mutuality Attention branch, which encodes contextual information by merging cross-modal information in order to learn more discriminative features and increase inter-class differences. We also add a regularization mechanism that filters the visual features matching the attribute information, to ensure effective learning. Because focusing on local features tends to neglect instance-level information, we further propose the Global Correlation Attention branch, which obtains a detail-activated representation by applying global pooling to the visual features serially along the spatial and channel dimensions. As the counterpart of the Local Mutuality Attention branch, this branch avoids learning bias. The outputs of the two branches are then aggregated into an integral feature embedding, which is used to enhance the prototypes. Extensive experiments on the CUB and SUN datasets demonstrate that our framework is effective. In particular, our method improves the accuracy of Prototype Network from 51.31 to 77.67 in the 5-way 1-shot setting on the CUB dataset with a Conv-4 backbone.
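For readers unfamiliar with the prototype-based baseline that the abstract compares against, the following is a minimal sketch of nearest-prototype classification in a 5-way 1-shot episode. It is not the DCMA-Net implementation: the attention branches, the attribute modality, and the Conv-4 backbone are all omitted, and `toy_embedding` is a hypothetical stand-in for a learned feature extractor. All function names and the toy data are assumptions made for illustration.

```python
import random

def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_prototypes(support):
    """Map each class label to the mean of its support embeddings."""
    return {label: mean_vector(embs) for label, embs in support.items()}

def classify(query, prototypes):
    """Return the label of the prototype nearest to the query embedding."""
    return min(prototypes, key=lambda label: squared_distance(query, prototypes[label]))

def toy_embedding(label, dim, rng, noise=0.1):
    """Hypothetical stand-in for a backbone: one-hot class centre plus Gaussian noise."""
    return [(5.0 if i == label else 0.0) + rng.gauss(0.0, noise) for i in range(dim)]

# Build one toy episode: 5 classes, 1 support sample each, 3 queries per class.
rng = random.Random(0)
n_way, n_shot, n_query, dim = 5, 1, 3, 8
support = {c: [toy_embedding(c, dim, rng) for _ in range(n_shot)] for c in range(n_way)}
queries = [(c, toy_embedding(c, dim, rng)) for c in range(n_way) for _ in range(n_query)]

prototypes = build_prototypes(support)
correct = sum(classify(emb, prototypes) == c for c, emb in queries)
accuracy = correct / len(queries)
print(f"{n_way}-way {n_shot}-shot toy accuracy: {accuracy:.2f}")
```

In the 1-shot case each prototype is simply the single support embedding, which is why the quality of the embedding (the part the abstract says DCMA-Net enhances) dominates the result.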
Pages: 14521 - 14537
Number of pages: 16