Coping with change: Learning invariant and minimum sufficient representations for fine-grained visual categorization

Cited by: 3

Authors:

Ye, Shuo [1]

Yu, Shujian [2,3]

Hou, Wenjin [1]

Wang, Yu [1]

You, Xinge [1]

Affiliations:

[1] Huazhong Univ Sci & Technol, Sch Elect Informat & Commun, Wuhan 430074, Hubei, Peoples R China

[2] Vrije Univ Amsterdam, Dept Comp Sci, Amsterdam, Netherlands

[3] UiT The Arctic Univ Norway, Machine Learning Grp, Tromso, Norway

Source:

COMPUTER VISION AND IMAGE UNDERSTANDING | 2023 / Volume 237

Funding:

National Key Research and Development Program of China;

Keywords:

Fine-grained visual categorization; Invariant risk minimization; Information bottleneck; ENTROPY;

DOI:

10.1016/j.cviu.2023.103837

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Fine-grained visual categorization (FGVC) is a challenging task due to similar visual appearances between various species. Previous studies always implicitly assume that the training and test data have the same underlying distributions, and that features extracted by modern backbone architectures remain discriminative and generalize well to unseen test data. However, we empirically justify that these conditions are not always true on benchmark datasets. To this end, we combine the merits of invariant risk minimization (IRM) and information bottleneck (IB) principle to learn invariant and minimum sufficient (IMS) representations for FGVC, such that the overall model can always discover the most succinct and consistent fine-grained features. We apply the matrix-based Rényi's α-order entropy to simplify and stabilize the training of IB; we also design a "soft" environment partition scheme to make IRM applicable to the FGVC task. To the best of our knowledge, we are the first to address the problem of FGVC from a generalization perspective and develop a new information-theoretic solution accordingly. Extensive experiments demonstrate the consistent performance gain offered by our IMS. Code is available at: https://github.com/SYe-hub/IMS.
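
Illustrative note (not part of the published abstract): the matrix-based Rényi entropy mentioned above is computed from a kernel Gram matrix of the samples rather than from an estimated density. The sketch below covers only the special case α = 2, where the entropy reduces to a sum of squared matrix entries and no eigendecomposition is needed. The function names, the Gaussian kernel, and the bandwidth `sigma` are assumptions chosen for illustration. This is not the authors' implementation, and the paper's actual α and kernel settings are not stated in this record.

```python
import math


def gaussian_gram(samples, sigma=1.0):
    """Gram matrix K[i][j] = exp(-||x_i - x_j||^2 / (2 * sigma^2)).

    The Gaussian kernel and sigma are illustrative assumptions.
    """
    n = len(samples)
    gram = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(samples[i], samples[j]))
            gram[i][j] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return gram


def renyi2_entropy(gram):
    """Matrix-based Renyi entropy of order alpha = 2, in bits.

    With A = K / trace(K), S_2(A) = -log2(trace(A @ A)). For a symmetric A,
    trace(A @ A) is the sum of squared entries of A.
    """
    trace = sum(gram[i][i] for i in range(len(gram)))
    sum_sq = sum((v / trace) ** 2 for row in gram for v in row)
    return -math.log2(sum_sq)


def renyi2_mutual_information(gram_x, gram_y):
    """I(X; Y) = S(X) + S(Y) - S(X, Y), where the joint entropy uses the
    element-wise (Hadamard) product of the two Gram matrices."""
    n = len(gram_x)
    joint = [[gram_x[i][j] * gram_y[i][j] for j in range(n)] for i in range(n)]
    return renyi2_entropy(gram_x) + renyi2_entropy(gram_y) - renyi2_entropy(joint)
```

Two sanity checks follow from the definition: identical samples give zero entropy, and n samples that the kernel treats as fully distinct give log2(n) bits.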

Pages: 11
