Attention Convolutional Binary Neural Tree for Fine-Grained Visual Categorization

Cited by: 142

Authors:

Ji, Ruyi [1,2]

Wen, Longyin [3]

Zhang, Libo [1]

Du, Dawei [4]

Wu, Yanjun [1]

Zhao, Chen [1]

Liu, Xianglong [5]

Huang, Feiyue [6]

Affiliations:

[1] ISCAS, State Key Lab Comp Sci, Beijing, Peoples R China

[2] Univ Chinese Acad Sci, Beijing, Peoples R China

[3] JD Finance Amer Corp, Mountain View, CA USA

[4] SUNY Albany, Albany, NY 12222 USA

[5] Beihang Univ, Beijing, Peoples R China

[6] Tencent Youtu Lab, Beijing, Peoples R China

Source:

2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020) | 2020

Funding:

National Natural Science Foundation of China;

Keywords:

DOI:

10.1109/CVPR42600.2020.01048

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Fine-grained visual categorization (FGVC) is an important but challenging task due to high intra-class variances and low inter-class variances caused by deformation, occlusion, illumination, etc. An attention convolutional binary neural tree is presented to address these problems for weakly supervised FGVC. Specifically, we incorporate convolutional operations along the edges of the tree structure, and use the routing functions in each node to determine the root-to-leaf computational paths within the tree. The final decision is computed as the summation of the predictions from the leaf nodes. The deep convolutional operations learn to capture the representations of objects, and the tree structure characterizes the coarse-to-fine hierarchical feature learning process. In addition, we use the attention transformer module to force the network to capture discriminative features. Several experiments on the CUB200-2011, Stanford Cars and Aircraft datasets demonstrate that our method performs favorably against state-of-the-art methods. Code can be found at https://isrc.iscas.ac.cn/gitlab/research/acnet.
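
The routing-and-summation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a complete binary tree stored in heap order, hypothetical linear routers with a sigmoid output at each internal node, and a final score that sums leaf predictions weighted by the probability of reaching each leaf (the abstract only says the leaf predictions are summed). The convolutional operations along edges and the attention transformer module are omitted, so every node sees the same feature vector.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def tree_predict(features, routers, leaf_preds):
    """Soft routing through a complete binary tree in heap order.

    features:   list of floats (stand-in for learned image features).
    routers:    one weight vector per internal node (hypothetical linear routers).
    leaf_preds: one class-score vector per leaf.
    Returns (class_scores, leaf_reach_probabilities).
    """
    n_internal = len(routers)
    n_leaves = len(leaf_preds)
    assert n_leaves == n_internal + 1, "a binary tree has one more leaf than internal nodes"

    # reach[i] is the probability that the sample arrives at node i.
    reach = [0.0] * (n_internal + n_leaves)
    reach[0] = 1.0
    for i, weights in enumerate(routers):
        # Routing function: probability of taking the right branch at node i.
        p_right = sigmoid(sum(w * x for w, x in zip(weights, features)))
        reach[2 * i + 1] = reach[i] * (1.0 - p_right)  # left child
        reach[2 * i + 2] = reach[i] * p_right          # right child

    leaf_reach = reach[n_internal:]
    n_classes = len(leaf_preds[0])
    # Final decision: sum of leaf predictions, weighted by reach probability.
    scores = [
        sum(r * pred[c] for r, pred in zip(leaf_reach, leaf_preds))
        for c in range(n_classes)
    ]
    return scores, leaf_reach


if __name__ == "__main__":
    feats = [1.0, 2.0]
    routers = [[0.0, 0.0]] * 3  # three internal nodes, so four leaves
    leaves = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
    print(tree_predict(feats, routers, leaves))
```

Because the two branch probabilities at each node sum to one, the reach probabilities over all leaves also sum to one, so the result is a convex combination of the leaf predictions.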


Pages: 10465-10474

Page count: 10

