An Erudite Fine-Grained Visual Classification Model

Cited by: 3

Authors:

Chang, Dongliang ^{[1]}

Tong, Yujun ^{[1]}

Du, Ruoyi ^{[1]}

Hospedales, Timothy ^{[2]}

Song, Yi-Zhe ^{[3]}

Ma, Zhanyu ^{[1]}

Affiliations:

[1] Beijing University of Posts and Telecommunications, Beijing, People's Republic of China

[2] University of Edinburgh, Edinburgh, Midlothian, Scotland

[3] University of Surrey, SketchX, CVSSP, Surrey, England

Source:

2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR | 2023

Funding:

National Natural Science Foundation of China; Beijing Natural Science Foundation

DOI:

10.1109/CVPR52729.2023.00702

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Current fine-grained visual classification (FGVC) models are isolated. In practice, we first need to identify the coarse-grained label of an object, then select the corresponding FGVC model for recognition. This hinders the application of FGVC algorithms in real-life scenarios. In this paper, we propose an erudite FGVC model jointly trained on several different datasets, which can efficiently and accurately predict an object's fine-grained label across the combined label space. We found through a pilot study that positive and negative transfers co-occur when different datasets are mixed for training, i.e., the knowledge from other datasets is not always useful. Therefore, we first propose a feature disentanglement module and a feature re-fusion module to reduce negative transfer and boost positive transfer between different datasets. In detail, we reduce negative transfer by decoupling the deep features through many dataset-specific feature extractors. Subsequently, these features are channel-wise re-fused to facilitate positive transfer. Finally, we propose a meta-learning based dataset-agnostic spatial attention layer to take full advantage of the multi-dataset training data, given that localisation is dataset-agnostic. Experimental results across 11 different mixed datasets built on four different FGVC datasets demonstrate the effectiveness of the proposed method. Furthermore, the proposed method can be easily combined with existing FGVC methods to obtain state-of-the-art results. Our code is available at https://github.com/PRIS-CV/An-Erudite-FGVC-Model.
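
The "combined label space" mentioned in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: the dataset names, class counts, and helper functions below are hypothetical and chosen only for illustration. It shows one plausible way to place per-dataset labels into a single index range, so that one classifier head covers every dataset and no coarse-grained routing step is needed first.

```python
from itertools import accumulate

# Hypothetical dataset names and class counts, for illustration only.
DATASET_SIZES = {"birds": 200, "cars": 196, "aircraft": 100}


def build_offsets(sizes):
    """Return the first combined-space index assigned to each dataset."""
    names = list(sizes)
    running = list(accumulate(sizes[name] for name in names))
    starts = [0] + running[:-1]
    return dict(zip(names, starts))


def to_combined(dataset, local_label, sizes, offsets):
    """Map a (dataset, local label) pair to an index in the combined space."""
    if not 0 <= local_label < sizes[dataset]:
        raise ValueError(f"label {local_label} out of range for {dataset}")
    return offsets[dataset] + local_label


def from_combined(combined_label, sizes, offsets):
    """Map a combined-space index back to its (dataset, local label) pair."""
    for name, start in offsets.items():
        if start <= combined_label < start + sizes[name]:
            return name, combined_label - start
    raise ValueError(f"label {combined_label} outside the combined space")


offsets = build_offsets(DATASET_SIZES)
combined = to_combined("cars", 5, DATASET_SIZES, offsets)
print(combined)                                         # 205
print(from_combined(combined, DATASET_SIZES, offsets))  # ('cars', 5)
```

Under these assumed sizes, a single output layer would have 496 units, and a prediction can be mapped back to its source dataset after the fact.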

Pages: 7268-7277

Number of pages: 10
