DCMA-Net: dual cross-modal attention for fine-grained few-shot recognition

Cited by: 0

Authors:

Yan Zhou

Xiao Ren

Jianxun Li

Yin Yang

Haibin Zhou

Affiliations:

[1] Xiangtan University, School of Automation and Electronic Information

[2] Shanghai Jiao Tong University, School of Electronic Information and Electrical Engineering

[3] Xiangtan University, School of Mathematics and Computational Science

Source:

Multimedia Tools and Applications | 2024 / Vol. 83

Keywords:

Few-shot learning; Fine-grained image recognition; Attention mechanism; Cross-modal fusion; Prototype

DOI:

Not available

Abstract:

Because comprehensive labeled samples are expensive to obtain, the Fine-grained Few-shot Recognition task aims to identify unseen meta classes using only one or a few labeled samples per class. Fine-grained Recognition also faces challenges such as minimal inter-class variation and background clutter, and most previous methods rely on a single visual modality. In this paper, we propose a novel Dual Cross-modal Attention Network (DCMA-Net) to address these problems. Concretely, we first propose the Local Mutuality Attention branch, which encodes contextual information by merging cross-modal information so as to learn more discriminative features and increase inter-class differences. We also add a regularization mechanism that filters the visual features that match the attribute information, to ensure effective learning. Because focusing on local features tends to ignore instance-level information, we further propose the Global Correlation Attention branch, which obtains a detail-activated representation by applying global pooling to the visual features serially along the spatial and channel dimensions. As the counterpart of the Local Mutuality Attention branch, this branch avoids learning bias. The outputs of the two branches are then aggregated into an integral feature embedding, which is used to enhance the prototypes. Extensive experiments on the CUB and SUN datasets demonstrate that our framework is effective. In particular, our method improves the accuracy of the Prototype Network from 51.31 to 77.67 in the 5-way 1-shot scenario on the CUB dataset with a Conv-4 backbone.
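For readers unfamiliar with the Prototype Network baseline named in the abstract, the following is a minimal sketch of prototype-based few-shot classification in general: each class prototype is the mean of its support embeddings, and a query is assigned to the nearest prototype. It illustrates the baseline idea only, not DCMA-Net or its attention branches. The embeddings, class labels, and function names are hypothetical and chosen purely for illustration.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def squared_euclidean(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_prototypes(support):
    """Map each class label to the mean of its support embeddings."""
    return {label: mean_vector(vecs) for label, vecs in support.items()}


def classify(query, prototypes):
    """Softmax over negative squared distances to each prototype.

    Returns the predicted label and the per-class probabilities.
    """
    logits = {lbl: -squared_euclidean(query, p) for lbl, p in prototypes.items()}
    m = max(logits.values())  # subtract the max for numerical stability
    exps = {lbl: math.exp(v - m) for lbl, v in logits.items()}
    total = sum(exps.values())
    probs = {lbl: e / total for lbl, e in exps.items()}
    return max(probs, key=probs.get), probs


# Toy 3-way 2-shot episode with made-up 2-D embeddings (assumption:
# a real system would obtain these from a backbone such as Conv-4).
support = {
    "class_a": [[0.0, 0.0], [0.2, 0.0]],
    "class_b": [[1.0, 1.0], [1.0, 1.2]],
    "class_c": [[-1.0, 1.0], [-1.2, 1.0]],
}
prototypes = build_prototypes(support)
label, probs = classify([0.9, 1.0], prototypes)
print(label)
```

In this framing, a method that "enhances the prototypes" would change how the embeddings or the prototypes are computed, while the nearest-prototype decision rule could stay the same.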

Pages: 14521-14537

Page count: 16
