Semantic-Guided Information Alignment Network for Fine-Grained Image Recognition

Cited by: 4
Authors
Wang, Shijie [1 ]
Wang, Zhihui [1 ]
Li, Haojie [1 ]
Chang, Jianlong [2 ]
Ouyang, Wanli [3 ]
Tian, Qi [2 ]
Affiliations
[1] Dalian Univ Technol, Int Sch Informat Sci & Engn, Dalian 116024, Peoples R China
[2] Huawei Cloud & AI, Shenzhen 518000, Peoples R China
[3] Univ Sydney, Sense Time Comp Vis Res Grp, Camperdown, NSW 2006, Australia
Funding
National Natural Science Foundation of China;
Keywords
Fine-grained image recognition; accurate semantic calibration; discriminative feature alignment; MODEL;
DOI
10.1109/TCSVT.2023.3263870
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification code
0808 ; 0809 ;
Abstract
Existing fine-grained image recognition works have attempted to dig into low-level details for emphasizing subtle discrepancies among sub-categories. However, a potential limitation of these methods is that they integrate the low-level details and high-level semantics directly, and neglect their content complementarity and spatial corresponding correlation. To handle this limitation, we propose an end-to-end Semantic-guided Information Alignment Network (SIA-Net) to dynamically pick out the low-level details under the guidance of accurate semantics to make selected details spatially corresponding to high-level semantics and complementary in content. Technically, SIA-Net consists of an Accurate Semantic Calibration (ASC) module for providing accurate semantics and a Discriminative Feature Alignment (DFA) module for aggregating low-level details and high-level semantics using accurate semantics generated by ASC. ASC learns the pixel-level feature shifting caused by convolutional operations, which is utilized for replacing the incorrectly highlighted semantics by shifting discriminative semantics or background features. After obtaining the accurate semantic features, DFA digs into the complementary details and simultaneously makes the selected details spatially corresponding via applying the guidance of accurate semantics to obtain the reassembly features. Finally, the reassembly features, which serve as discriminative cues, are used for more accurate discriminative region localization. Extensive experiments verify that our proposed method yields the best performance under the same settings with the most competitive approaches on CUB-birds, Stanford-Cars, and FGVC Aircraft datasets.
Pages: 6558 - 6570
Number of pages: 13
Related papers
50 in total
  • [1] Fine-grained and Semantic-guided Visual Attention for Image Captioning
    Zhang, Zongjian
    Wu, Qiang
    Wang, Yang
    Chen, Fang
    [J]. 2018 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2018), 2018, : 1709 - 1717
  • [2] High-Quality Image Captioning With Fine-Grained and Semantic-Guided Visual Attention
    Zhang, Zongjian
    Wu, Qiang
    Wang, Yang
    Chen, Fang
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2019, 21 (07) : 1681 - 1693
  • [3] Contrastive Semantic-Guided Image Smoothing Network
    Wang, Jie
    Wang, Yongzhen
    Feng, Yidan
    Gong, Lina
    Yan, Xuefeng
    Xie, Haoran
    Wang, Fu Lee
    Wei, Mingqiang
    [J]. COMPUTER GRAPHICS FORUM, 2022, 41 (07) : 335 - 346
  • [4] Class Guided Channel Weighting Network for Fine-Grained Semantic Segmentation
    Zhang, Xiang
    Zhao, Wanqing
    Luo, Hangzai
    Peng, Jinye
    Fan, Jianping
    [J]. THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 3344 - 3352
  • [5] A Semantic-driven Image Scene Fine-grained Enhancement Recognition
    Qu, Dongyang
    Li, Yaling
    Luo, Xiaoyan
    Shi, Xiaofeng
    [J]. SEVENTH ASIA PACIFIC CONFERENCE ON OPTICS MANUFACTURE (APCOM 2021), 2022, 12166
  • [6] Feature Correlation Residual Network for Fine-Grained Image Recognition
    Xu, Jiazhen
    Wei, Yantao
    Deng, Wei
    [J]. IEEE ACCESS, 2020, 8 : 214322 - 214331
  • [7] Improved Semantic-Aware Network Embedding with Fine-Grained Word Alignment
    Shen, Dinghan
    Zhang, Xinyuan
    Henao, Ricardo
    Carin, Lawrence
    [J]. 2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 1829 - 1838
  • [8] Fine-grained Semantic Alignment Network for Weakly Supervised Temporal Language Grounding
    Wang, Yuechen
    Zhou, Wengang
    Li, Houqiang
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 89 - 99
  • [9] Semantic bilinear pooling for fine-grained recognition
    School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
    [J]. Proc. Int. Conf. Pattern Recognit., (3660-3666):
  • [10] Semantic Bilinear Pooling for Fine-Grained Recognition
    Li, Xinjie
    Yang, Chun
    Chen, Song-Lu
    Zhu, Chao
    Yin, Xu-Cheng
    [J]. 2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 3660 - 3666