STMG: Swin transformer for multi-label image recognition with graph convolution network

Cited by: 0
Authors
Yangtao Wang
Yanzhao Xie
Lisheng Fan
Guangxing Hu
Affiliations
[1] Guangzhou University, School of Computer Science and Cyber Engineering
[2] Huazhong University of Science and Technology
Source
Neural Computing and Applications | 2022, Vol. 34
Keywords
Swin transformer; Graph convolution network; Multi-label image recognition
DOI
Not available
Abstract
Vision Transformer (ViT) has achieved promising single-label image classification results compared to conventional neural network-based models. Nevertheless, few ViT-related studies have explored label dependencies in multi-label image recognition. To this end, we propose STMG, which combines a transformer and a graph convolution network (GCN) to extract image features and learn label dependencies for multi-label image recognition. STMG consists of an image representation learning module and a label co-occurrence embedding module. First, in the image representation learning module, to avoid computing the similarity between every pair of patches, we adopt Swin transformer instead of ViT to generate the image feature for each input image. Second, in the label co-occurrence embedding module, we design a two-layer GCN that adaptively captures the label dependencies and outputs the label co-occurrence embeddings. Finally, STMG fuses the image feature and the label co-occurrence embeddings to produce the classification results, trained with the commonly used multi-label classification loss function and an L2-norm loss function. We conduct extensive experiments on two multi-label image datasets, MS-COCO and FLICKR25K. Experimental results demonstrate that STMG achieves better performance, in both convergence efficiency and classification results, than state-of-the-art multi-label image recognition methods. Our code is open-sourced and publicly available on GitHub: https://github.com/lzHZWZ/STMG.
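The label branch described in the abstract can be illustrated with a rough sketch: a two-layer GCN turns per-label input vectors into label embeddings, which are then fused with an image feature to give one score per label. This is not the authors' implementation. The fixed toy adjacency matrix, the dot-product fusion, the random vector standing in for the Swin transformer output, and all function and variable names are assumptions made for illustration; the abstract specifies neither the fusion operator nor the form of the L2-norm loss, so that loss term is omitted here.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Symmetric normalization D^(-1/2) (A + I) D^(-1/2) with self-loops."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def relu(m):
    return [[max(0.0, v) for v in row] for row in m]


def gcn_two_layer(adj_norm, label_feats, w1, w2):
    """Two graph-convolution layers: A_norm * ReLU(A_norm * X * W1) * W2."""
    hidden = relu(matmul(matmul(adj_norm, label_feats), w1))
    return matmul(matmul(adj_norm, hidden), w2)


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over labels, in a numerically stable form."""
    total = 0.0
    for z, t in zip(logits, targets):
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
num_labels, label_dim, hidden_dim, feat_dim = 3, 4, 5, 6

# Toy label co-occurrence graph (assumption): label 1 co-occurs with labels 0 and 2.
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
adj_norm = normalize_adjacency(adj)

label_feats = random_matrix(num_labels, label_dim, rng)  # per-label input vectors
w1 = random_matrix(label_dim, hidden_dim, rng)           # layer-1 weights
w2 = random_matrix(hidden_dim, feat_dim, rng)            # layer-2 weights

# Stand-in for the image feature a Swin transformer backbone would produce.
image_feat = [rng.uniform(-1.0, 1.0) for _ in range(feat_dim)]

# Label embeddings: one vector of size feat_dim per label.
label_emb = gcn_two_layer(adj_norm, label_feats, w1, w2)

# Assumed fusion: dot product between each label embedding and the image feature.
logits = [sum(e * f for e, f in zip(row, image_feat)) for row in label_emb]

targets = [1.0, 0.0, 1.0]  # toy multi-hot ground truth
loss = bce_with_logits(logits, targets)
print("logits:", logits)
print("multi-label loss:", loss)
```

In a trainable version the weights, and possibly the adjacency matrix itself, would be learned by gradient descent; the fixed matrices here only show how the shapes fit together.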
Pages: 10051-10063
Page count: 12
Related papers
50 in total
  • [31] M-GCN: Brain-inspired memory graph convolutional network for multi-label image recognition
    Yao, Xiao
    Xu, Feiyang
    Gu, Min
    Wang, Peipei
    NEURAL COMPUTING & APPLICATIONS, 2022, 34 (08): : 6489 - 6502
  • [32] Double Attention Based on Graph Attention Network for Image Multi-Label Classification
    Zhou, Wei
    Xia, Zhiwu
    Dou, Peng
    Su, Tao
    Hu, Haifeng
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2023, 19 (01)
  • [33] A multi-scale semantic attention representation for multi-label image recognition with graph networks
    Liang, Jun
    Xu, Feiteng
    Yu, Songsen
    NEUROCOMPUTING, 2022, 491 : 14 - 23
  • [35] Active learning in multi-label image classification with graph convolutional network embedding
    Xie, Xiurui
    Tian, Maojun
    Luo, Guangchun
    Liu, Guisong
    Wu, Yizhe
    Qin, Ke
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2023, 148 : 56 - 65
  • [36] MSFA: Multi-stage feature aggregation network for multi-label image recognition
    Chen, Jiale
    Xu, Feng
    Zeng, Tao
    Li, Xin
    Chen, Shangjing
    Yu, Jie
    IET IMAGE PROCESSING, 2024, 18 (07) : 1862 - 1877
  • [37] Triplet Transformer Network for Multi-Label Document Classification
    Melsbach, Johannes
    Stahlmann, Sven
    Hirschmeier, Stefan
    Schoder, Detlef
    PROCEEDINGS OF THE 2022 ACM SYMPOSIUM ON DOCUMENT ENGINEERING, DOCENG 2022, 2022,
  • [38] A Unified Modular Framework with Deep Graph Convolutional Networks for Multi-label Image Recognition
    Lin, Qifan
    Chen, Zhaoliang
    Wang, Shiping
    Guo, Wenzhong
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2021, PT II, 2021, 13020 : 54 - 65
  • [39] A novel multi-label pest image classifier using the modified Swin Transformer and soft binary cross entropy loss
    Guo, Qingwen
    Wang, Chuntao
    Xiao, Deqin
    Huang, Qiong
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 126
  • [40] Adaptive knowledge graph for multi-label image classification
    Lin, Zhihong
    Tang, Xue-song
    Hao, Kuangrong
    Zhao, Mingbo
    Li, Yubing
    APPLIED INTELLIGENCE, 2025, 55 (01)