Attention Convolutional Binary Neural Tree for Fine-Grained Visual Categorization

Cited by: 142
Authors
Ji, Ruyi [1 ,2 ]
Wen, Longyin [3 ]
Zhang, Libo [1 ]
Du, Dawei [4 ]
Wu, Yanjun [1 ]
Zhao, Chen [1 ]
Liu, Xianglong [5 ]
Huang, Feiyue [6 ]
Affiliations
[1] ISCAS, State Key Lab Comp Sci, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
[3] JD Finance Amer Corp, Mountain View, CA USA
[4] SUNY Albany, Albany, NY 12222 USA
[5] Beihang Univ, Beijing, Peoples R China
[6] Tencent Youtu Lab, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
10.1109/CVPR42600.2020.01048
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Fine-grained visual categorization (FGVC) is an important but challenging task due to high intra-class variances and low inter-class variances caused by deformation, occlusion, illumination, etc. An attention convolutional binary neural tree is presented to address those problems for weakly supervised FGVC. Specifically, we incorporate convolutional operations along the edges of the tree structure, and use the routing functions in each node to determine the root-to-leaf computational paths within the tree. The final decision is computed as the summation of the predictions from the leaf nodes. The deep convolutional operations learn to capture the representations of objects, and the tree structure characterizes the coarse-to-fine hierarchical feature learning process. In addition, we use the attention transformer module to force the network to capture discriminative features. Several experiments on the CUB200-2011, Stanford Cars and Aircraft datasets demonstrate that our method performs favorably against state-of-the-art methods. Code can be found at https://isrc.iscas.ac.cn/gitlab/research/acnet.
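The tree-structured prediction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: small linear maps with a tanh stand in for the convolutional operations on edges, a sigmoid over a linear score stands in for the routing function, the attention transformer module is omitted, and weighting each leaf prediction by the probability of its root-to-leaf path is an assumption about how the summation is carried out. All names (`random_params`, `tree_predict`, the parameter keys) are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(zs):
    m = max(zs)
    es = [math.exp(z - m) for z in zs]
    total = sum(es)
    return [e / total for e in es]


def matvec(w, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]


def random_params(dim, n_classes, depth, seed=0):
    # Hypothetical parameter layout: nodes are stored in heap order,
    # so node i has children 2*i + 1 and 2*i + 2.
    rng = random.Random(seed)
    n_nodes = 2 ** (depth + 1) - 1

    def mat(rows, cols):
        return [[rng.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]

    return {
        "route": [mat(1, dim)[0] for _ in range(n_nodes)],    # routing score weights
        "edge": [mat(dim, dim) for _ in range(n_nodes)],      # stand-in for edge convolutions
        "leaf": [mat(n_classes, dim) for _ in range(n_nodes)],  # leaf classifiers
    }


def tree_predict(x, params, depth):
    # Each frontier entry: (node index, features at node, probability of reaching node).
    frontier = [(0, x, 1.0)]
    for _ in range(depth):
        next_frontier = []
        for node, h, p in frontier:
            score = sum(w * v for w, v in zip(params["route"][node], h))
            p_right = sigmoid(score)  # soft routing decision at this node
            for child, p_branch in ((2 * node + 1, 1.0 - p_right),
                                    (2 * node + 2, p_right)):
                # Transform features along the edge leading to the child.
                h_child = [math.tanh(a) for a in matvec(params["edge"][child], h)]
                next_frontier.append((child, h_child, p * p_branch))
        frontier = next_frontier

    # Final decision: sum of leaf predictions, each weighted by its path probability.
    n_classes = len(params["leaf"][0])
    out = [0.0] * n_classes
    for node, h, p in frontier:
        for k, q in enumerate(softmax(matvec(params["leaf"][node], h))):
            out[k] += p * q
    return out
```

For example, `tree_predict([1.0, 1.0, 1.0, 1.0], random_params(4, 3, 2), 2)` evaluates a depth-2 tree with four leaves and returns three class scores. Because the path probabilities sum to one and each leaf output is a softmax, the result is itself a probability distribution.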
Pages: 10465-10474
Page count: 10
Related papers
50 in total
  • [31] ProtoSimi: label correction for fine-grained visual categorization
    Jialiang Shen
    Yu Yao
    Shaoli Huang
    Zhiyong Wang
    Jing Zhang
    Ruxing Wang
    Jun Yu
    Tongliang Liu
    Machine Learning, 2024, 113 : 1903 - 1920
  • [32] Fine-grained Event Categorization with Heterogeneous Graph Convolutional Networks
    Peng, Hao
    Li, Jianxin
    Gong, Qiran
    Song, Yangqiu
    Ning, Yuanxing
    Lai, Kunfeng
    Yu, Philip S.
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 3238 - 3245
  • [33] AP-CNN: Weakly Supervised Attention Pyramid Convolutional Neural Network for Fine-Grained Visual Classification
    Ding, Yifeng
    Ma, Zhanyu
    Wen, Shaoguo
    Xie, Jiyang
    Chang, Dongliang
    Si, Zhongwei
    Wu, Ming
    Ling, Haibin
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 2826 - 2836
  • [34] Fine-Grained Visual Classification Based on Sparse Bilinear Convolutional Neural Network
    Ma L.
    Wang Y.
    Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, 2019, 32 (04): : 336 - 344
  • [35] Fine-Grained Categorization by Alignments
    Gavves, E.
    Fernando, B.
    Snoek, C. G. M.
    Smeulders, A. W. M.
    Tuytelaars, T.
    2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2013, : 1713 - 1720
  • [36] Adaptive Multi-Attention Convolutional Neural Network for Fine-Grained Image Recognition
    Li, Ang
    Chen, Jianxin
    Kang, Bin
    Zhuang, Wenqin
    Zhang, Xuguang
    2019 IEEE GLOBECOM WORKSHOPS (GC WKSHPS), 2019,
  • [37] Learning Multi-Attention Convolutional Neural Network for Fine-Grained Image Recognition
    Zheng, Heliang
    Fu, Jianlong
    Mei, Tao
    Luo, Jiebo
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 5219 - 5227
  • [38] Recombining Vision Transformer Architecture for Fine-Grained Visual Categorization
    Deng, Xuran
    Liu, Chuanbin
    Lu, Zhiying
    MULTIMEDIA MODELING, MMM 2023, PT II, 2023, 13834 : 127 - 138
  • [39] Fine-grained Visual Categorization with 2D-Warping
    Hanselmann, Harald
    Ney, Hermann
    2014 22ND INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2014, : 608 - 613
  • [40] Cross-X Learning for Fine-Grained Visual Categorization
    Luo, Wei
    Yang, Xitong
    Mo, Xianjie
    Lu, Yuheng
    Davis, Larry S.
    Li, Jun
    Yang, Jian
    Lim, Ser-Nam
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 8241 - 8250