Learn from each other to Classify better: Cross-layer mutual attention learning for fine-grained visual classification

Cited by: 19
Authors
Liu, Dichao [1 ,3 ]
Zhao, Longjiao [1 ]
Wang, Yu [2 ]
Kato, Jien [2 ]
Affiliations
[1] Nagoya Univ, Grad Sch Informat, Furo Cho,Chikusa Ku, Nagoya, Aichi 4648601, Japan
[2] Ritsumeikan Univ, Coll Informat Sci & Engn, 1 Nojihigashi, Kusatsu, Shiga 5250058, Japan
[3] Navier Inc, Res Team, 9-2 Nibancho,Chiyoda Ku, Tokyo 1020084, Japan
Keywords
Fine-grained recognition; Image classification; Deep features; Neural network
DOI
10.1016/j.patcog.2023.109550
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Fine-grained visual classification (FGVC) is valuable yet challenging. The difficulty of FGVC mainly lies in its intrinsic inter-class similarity, intra-class variation, and limited training data. Moreover, with the popularity of deep convolutional neural networks, researchers have mainly used deep, abstract, semantic information for FGVC, while shallow, detailed information has been neglected. This work proposes a cross-layer mutual attention learning network (CMAL-Net) to solve the above problems. Specifically, this work views the shallow to deep layers of CNNs as "experts" knowledgeable about different perspectives. We let each expert give a category prediction and an attention region indicating the found clues. Attention regions are treated as information carriers among experts, bringing three benefits: (i) helping the model focus on discriminative regions; (ii) providing more training data; (iii) allowing experts to learn from each other to improve the overall performance. CMAL-Net achieves state-of-the-art performance on three competitive datasets: FGVC-Aircraft, Stanford Cars, and Food-11. The source code is available at https://github.com/Dichao-Liu/CMAL
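The sketch below illustrates the multi-expert idea described in the abstract: classifier heads attached to shallow, middle, and deep backbone stages, each emitting a category prediction and a coarse attention map that could be used to crop discriminative regions for further training. It is a minimal, hypothetical example assuming a ResNet-50 trunk; `ExpertHead`, `MultiExpertNet`, and the channel-mean attention are illustrative assumptions, not the authors' CMAL-Net implementation (see the linked GitHub repository for that).

```python
# Hypothetical sketch of "experts at different depths", NOT the official CMAL-Net code.
import torch
import torch.nn as nn
import torchvision


class ExpertHead(nn.Module):
    """Classification head plus a coarse attention map for one backbone stage."""
    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(in_channels, num_classes)

    def forward(self, feat: torch.Tensor):
        logits = self.fc(self.pool(feat).flatten(1))      # category prediction
        attention = feat.mean(dim=1, keepdim=True)        # channel-pooled attention map
        return logits, attention


class MultiExpertNet(nn.Module):
    """Three 'experts' reading features of increasing depth from a ResNet-50 trunk."""
    def __init__(self, num_classes: int):
        super().__init__()
        backbone = torchvision.models.resnet50(weights=None)
        self.stem = nn.Sequential(backbone.conv1, backbone.bn1, backbone.relu,
                                  backbone.maxpool, backbone.layer1)
        self.stage2, self.stage3, self.stage4 = backbone.layer2, backbone.layer3, backbone.layer4
        self.experts = nn.ModuleList(ExpertHead(c, num_classes) for c in (512, 1024, 2048))

    def forward(self, x: torch.Tensor):
        f2 = self.stage2(self.stem(x))
        f3 = self.stage3(f2)
        f4 = self.stage4(f3)
        # Each expert returns (logits, attention); in the paper's spirit, the attention
        # maps would be used to crop discriminative regions and exchange them as extra
        # training inputs so the experts can learn from each other.
        return [head(feat) for head, feat in zip(self.experts, (f2, f3, f4))]


if __name__ == "__main__":
    model = MultiExpertNet(num_classes=100)
    outputs = model(torch.randn(2, 3, 224, 224))
    for i, (logits, attn) in enumerate(outputs, 1):
        print(f"expert {i}: logits {tuple(logits.shape)}, attention {tuple(attn.shape)}")
```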
Pages: 12
Related Papers
50 records in total
  • [1] Cross-layer progressive attention bilinear fusion method for fine-grained visual classification
    Wang, Chaoqing
    Qian, Yurong
    Gong, Weijun
    Cheng, Junjong
    Wang, Yongqiang
    Wang, Yuefei
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2022, 82
  • [2] Fine-grained Cross-Layer Attention Framework for Wound Stage Classification
    Nagda, Keval
    Briden, Michael
    Norouzi, Narges
    2022 IEEE-EMBS INTERNATIONAL CONFERENCE ON BIOMEDICAL AND HEALTH INFORMATICS (BHI) JOINTLY ORGANISED WITH THE IEEE-EMBS INTERNATIONAL CONFERENCE ON WEARABLE AND IMPLANTABLE BODY SENSOR NETWORKS (BSN'22), 2022,
  • [3] Adopting Attention and Cross-Layer Features for Fine-Grained Representation
    Sun Fayou
    Ngo, Hea Choon
    Sek, Yong Wee
    IEEE ACCESS, 2022, 10 : 82376 - 82383
  • [4] Learning Hierarchal Channel Attention for Fine-grained Visual Classification
    Guan, Xiang
    Wang, Guoqing
    Xu, Xing
    Bin, Yi
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 5011 - 5019
  • [5] LEARN MORE: SUB-SIGNIFICANT AREA LEARNING FOR FINE-GRAINED VISUAL CLASSIFICATION
    Pan, Weiyao
    Yang, Shengying
    Qian, Xiaohong
    Lei, Jingsheng
    Zhang, Shuai
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 485 - 489
  • [6] Learning Cascade Attention for fine-grained image classification
    Zhu, Youxiang
    Li, Ruochen
    Yang, Yin
    Ye, Ning
    NEURAL NETWORKS, 2020, 122 : 174 - 182
  • [7] A Progressive Gated Attention Model for Fine-Grained Visual Classification
    Zhu, Qiangxi
    Li, Zhixin
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 2063 - 2068
  • [8] A collaborative gated attention network for fine-grained visual classification
    Zhu, Qiangxi
    Kuang, Wenlan
    Li, Zhixin
    DISPLAYS, 2023, 79
  • [9] Hierarchical attention vision transformer for fine-grained visual classification
    Hu, Xiaobin
    Zhu, Shining
    Peng, Taile
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2023, 91
  • [10] Diversified Visual Attention Networks for Fine-Grained Object Classification
    Zhao, Bo
    Wu, Xiao
    Feng, Jiashi
    Peng, Qiang
    Yan, Shuicheng
    IEEE TRANSACTIONS ON MULTIMEDIA, 2017, 19 (06) : 1245 - 1256