Grassmann Pooling as Compact Homogeneous Bilinear Pooling for Fine-Grained Visual Classification

Cited by: 51
Authors
Wei, Xing [1]
Zhang, Yue [1]
Gong, Yihong [1]
Zhang, Jiawei [2]
Zheng, Nanning [1]
Affiliations
[1] Xi An Jiao Tong Univ, Inst Artificial Intelligence & Robot, Xian, Peoples R China
[2] SenseTime Res, Shenzhen, Peoples R China
Source
Funding
National Natural Science Foundation of China
Keywords
Fine-grained visual classification; Bilinear pooling; Singular Value Decomposition; Grassmann manifold; Visual burstiness;
DOI
10.1007/978-3-030-01219-9_22
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Designing discriminative and invariant features is the key to visual recognition. Recently, the bilinear pooled feature matrix of a Convolutional Neural Network (CNN) has been shown to achieve state-of-the-art performance on a range of fine-grained visual recognition tasks. The bilinear feature matrix collects second-order statistics and is closely related to the covariance matrix descriptor. However, the bilinear feature can suffer from the visual burstiness phenomenon, as do other visual representations such as VLAD and Fisher Vector. The reason is that the bilinear feature matrix is sensitive to the magnitudes and correlations of local CNN feature elements, which can be measured by its singular values. The singular vectors, on the other hand, are more invariant and are therefore a more reasonable choice of feature representation. Motivated by this point, we advocate an alternative pooling method which transforms the CNN feature matrix into an orthonormal matrix consisting of its principal singular vectors. Geometrically, such an orthonormal matrix lies on the Grassmann manifold, a Riemannian manifold whose points represent subspaces of the Euclidean space. Similarity measurement of images reduces to comparing the principal angles between these "homogeneous" subspaces and is thus independent of the magnitudes and correlations of local CNN activations. In particular, we demonstrate that the projection distance on the Grassmann manifold induces a bilinear feature mapping without explicitly computing the bilinear feature matrix, which enables a very compact feature and classifier representation. Experimental results show that our method achieves an excellent balance of model complexity and accuracy on a variety of fine-grained image classification datasets.
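The pooling step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `grassmann_pool` and `projection_distance_sq`, the subspace dimension `k`, and the (channels x locations) layout of the feature matrix are all assumptions made for illustration. The sketch keeps the top-k left singular vectors of a feature matrix and compares two such subspaces with the squared projection distance, using the identity (1/2)||U1 U1^T - U2 U2^T||_F^2 = k - ||U1^T U2||_F^2.

```python
import numpy as np


def grassmann_pool(features: np.ndarray, k: int) -> np.ndarray:
    """Return the top-k left singular vectors of a feature matrix.

    `features` is assumed to have shape (c, n): c channels, n spatial
    locations. The result has shape (c, k) and orthonormal columns, so it
    represents a k-dimensional subspace (a point on a Grassmann manifold).
    """
    # np.linalg.svd returns singular values in descending order, so the
    # first k columns of u are the principal singular vectors.
    u, _, _ = np.linalg.svd(features, full_matrices=False)
    return u[:, :k]


def projection_distance_sq(u1: np.ndarray, u2: np.ndarray) -> float:
    """Squared projection distance between two k-dimensional subspaces.

    Equals 0.5 * ||u1 u1^T - u2 u2^T||_F^2, computed as
    k - ||u1^T u2||_F^2 so the (c, c) projection matrices are never formed.
    """
    k = u1.shape[1]
    return float(k - np.linalg.norm(u1.T @ u2, "fro") ** 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((8, 20))   # hypothetical 8-channel feature map
    y = rng.standard_normal((8, 20))   # a second, unrelated feature map

    ux = grassmann_pool(x, k=3)
    # Rescaling the activations changes the singular values but not the
    # singular vectors, so the pooled subspace is unchanged.
    ux_scaled = grassmann_pool(3.0 * x, k=3)
    uy = grassmann_pool(y, k=3)

    print("distance to rescaled copy:", projection_distance_sq(ux, ux_scaled))
    print("distance to other map:   ", projection_distance_sq(ux, uy))
```

Because the distance depends only on the subspaces spanned by the singular vectors, uniformly rescaling the activations leaves it at (numerically) zero, which mirrors the invariance to feature magnitudes claimed in the abstract. The choice k = 3 and the random inputs are arbitrary.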
Pages: 365-380
Number of pages: 16