Double Attention Based on Graph Attention Network for Image Multi-Label Classification

Cited by: 13
Authors
Zhou, Wei [1 ]
Xia, Zhiwu [1 ]
Dou, Peng [1 ]
Su, Tao [1 ]
Hu, Haifeng [1 ]
Affiliations
[1] Sun Yat Sen Univ, Sch Elect & Informat Technol, Guangzhou, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Multi-label classification; label correlation; channel attention mechanism; graph attention network; visual analysis; EFFICIENT;
DOI
10.1145/3519030
Chinese Library Classification
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
The task of image multi-label classification is to accurately recognize multiple objects in an input image. Most recent works rely on a label co-occurrence matrix counted from the training data to construct the graph structure, which is inflexible and may degrade model generalizability. In addition, these methods fail to capture the semantic correlation between channel feature maps, which could further improve model performance. To address these issues, we propose DA-GAT (a Double Attention framework based on the Graph Attention neTwork) to effectively learn the correlation between labels from training data. First, we devise a new channel attention mechanism to enhance the semantic correlation between channel feature maps, so as to implicitly capture the correlation between labels. Second, we propose a new label attention mechanism to avoid the adverse impact of a manually constructed label co-occurrence matrix. It takes only the label embeddings as network input and automatically constructs the label relation matrix to explicitly establish the correlation between labels. Finally, we fuse the outputs of these two attention mechanisms to further improve model performance. Extensive experiments are conducted on three public multi-label classification benchmarks. Our DA-GAT model achieves mean average precision of 87.1%, 96.6%, and 64.3% on MS-COCO 2014, PASCAL VOC 2007, and NUS-WIDE, respectively, and clearly outperforms existing state-of-the-art methods. In addition, visual analysis experiments demonstrate that each attention mechanism captures the correlation between labels well and significantly improves model performance.
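To make the label attention idea more concrete, the following is a minimal sketch of how a label relation matrix could be derived from label embeddings alone, using the standard graph-attention scoring rule (project each embedding, score every pair with a LeakyReLU over the concatenated projections, then normalize each row with a softmax). This is an illustration under stated assumptions, not the DA-GAT implementation: the abstract does not give the exact formulation, and the names `label_relation_matrix`, `W`, `a`, and the toy dimensions are hypothetical.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def leaky_relu(z, slope=0.2):
    return z if z > 0 else slope * z


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_relation_matrix(embeddings, W, a, slope=0.2):
    """GAT-style attention over a fully connected label graph.

    embeddings: one vector per label (hypothetical label embeddings)
    W:          projection matrix, shape (out_dim, in_dim)
    a:          attention vector, length 2 * out_dim
    Returns a matrix whose row i holds the attention weights that
    label i assigns to every label j; each row sums to 1.
    """
    h = [matvec(W, x) for x in embeddings]
    relation = []
    for h_i in h:
        # Score each pair (i, j) from the concatenation [h_i || h_j].
        scores = [
            leaky_relu(sum(w * v for w, v in zip(a, h_i + h_j)), slope)
            for h_j in h
        ]
        relation.append(softmax(scores))
    return relation


# Toy setup: 4 labels, 6-dimensional embeddings, 3-dimensional projection.
# Random values stand in for parameters that would be learned in training.
rng = random.Random(0)
num_labels, in_dim, out_dim = 4, 6, 3
embeddings = [[rng.gauss(0, 1) for _ in range(in_dim)] for _ in range(num_labels)]
W = [[rng.gauss(0, 1) for _ in range(in_dim)] for _ in range(out_dim)]
a = [rng.gauss(0, 1) for _ in range(2 * out_dim)]

relation = label_relation_matrix(embeddings, W, a)
for row in relation:
    print([round(v, 3) for v in row])
```

Because the matrix is computed from the embeddings and learnable parameters, no co-occurrence statistics from the training set are needed to build it, which is the property the abstract emphasizes.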
Pages: 23
Related papers
50 in total
  • [1] Graph Attention Transformer Network for Multi-label Image Classification
    Yuan, Jin
    Chen, Shikai
    Zhang, Yao
    Shi, Zhongchao
    Geng, Xin
    Fan, Jianping
    Rui, Yong
    [J]. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2023, 19 (04)
  • [2] Double Attention for Multi-Label Image Classification
    Zhao, Haiying
    Zhou, Wei
    Hou, Xiaogang
    Zhu, Hui
    [J]. IEEE ACCESS, 2020, 8 : 225539 - 225550
  • [3] Multi-Label Image Classification by Feature Attention Network
    Yan, Zheng
    Liu, Weiwei
    Wen, Shiping
    Yang, Yin
    [J]. IEEE ACCESS, 2019, 7 : 98005 - 98013
  • [4] Visual Attention in Multi-Label Image Classification
    Luo, Yan
    Jiang, Ming
    Zhao, Qi
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2019), 2019, : 820 - 827
  • [5] A Weighted Graph Attention Network Based Method for Multi-label Classification of Electrocardiogram Abnormalities
    Wang, Hongmei
    Zhao, Wei
    Li, Zhenqi
    Jia, Dongya
    Yan, Cong
    Hu, Jing
    Fang, Jiansheng
    Yang, Ming
    [J]. 42ND ANNUAL INTERNATIONAL CONFERENCES OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY: ENABLING INNOVATIVE TECHNOLOGIES FOR GLOBAL HEALTHCARE EMBC'20, 2020, : 418 - 421
  • [6] Attention-Augmented Memory Network for Image Multi-Label Classification
    Zhou, Wei
    Hou, Yanke
    Chen, Dihu
    Hu, Haifeng
    Su, Tao
    [J]. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2023, 19 (03)
  • [7] MAGNET: Multi-Label Text Classification using Attention-based Graph Neural Network
    Pal, Ankit
    Selvakumar, Muru
    Sankarasubbu, Malaikannan
    [J]. ICAART: PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE, VOL 2, 2020, : 494 - 505
  • [8] Multi-label Feature Extraction With Distance-Based Graph Attention Network
    Peng, Yue
    Qian, Kun
    Song, Guojie
    Min, Fan
    [J]. ROUGH SETS, IJCRS 2022, 2022, 13633 : 203 - 216
  • [9] DATran: Dual Attention Transformer for Multi-Label Image Classification
    Zhou, Wei
    Zheng, Zhijie
    Su, Tao
    Hu, Haifeng
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (01) : 342 - 356
  • [10] Pose Guided Attention for Multi-label Fashion Image Classification
    Ferreira, Beatriz Quintino
    Costeira, Joao P.
    Sousa, Ricardo G.
    Gui, Liang-Yan
    Gomes, Joao P.
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 3125 - 3128