Coarse to Fine: Multi-label Image Classification with Global/Local Attention

Authors
Lyu, Fan [1]
Hu, Fuyuan [1]
Sheng, Victor S. [2]
Wu, Zhengtian [1]
Fu, Qiming [3]
Fu, Baochuan [1]
Affiliations
[1] Suzhou Univ Sci & Technol, Sch Elect & Informat Engn, Suzhou, Peoples R China
[2] Univ Cent Arkansas, Comp Sci Dept, Conway, AR USA
[3] Jiangsu Prov Key Lab Intelligent Bldg Energy Effi, Suzhou, Peoples R China
Keywords
Multi-label image classification; Scene recognition; Deep learning
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In daily life, the scenes around us usually carry multiple labels, especially in a smart city, where information about city operation must be recognized in order to respond and control. Great efforts have been made to recognize multi-label images with deep neural networks. Because multi-label image classification is very complicated, researchers have turned to the attention mechanism to guide the classification process. However, conventional attention-based methods analyze images directly and aggressively, which makes it difficult for them to understand complicated scenes well. In this paper, we propose a global/local attention method that recognizes an image from coarse to fine by mimicking how human beings observe images. Specifically, our global/local attention method first concentrates on the whole image and then focuses on specific local objects in the image. We also propose a joint max-margin objective function, which enforces that the minimum score of the positive labels be larger than the maximum score of the negative labels, both horizontally and vertically. This function further improves our multi-label image classification method. We evaluate the effectiveness of our method on two popular multi-label image datasets (Pascal VOC and MS-COCO). Our experimental results show that our method outperforms state-of-the-art methods.
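The joint max-margin idea in the abstract can be illustrated with a small sketch. This is not the authors' formulation, which the abstract does not spell out. It assumes that "horizontally" means across the labels of one image (a row of a batch score matrix) and "vertically" means across the images of one label (a column), and that the constraint is enforced with a hinge penalty. The function names and the `margin` parameter are hypothetical.

```python
def max_margin_term(scores, labels, margin=1.0):
    """Hinge penalty for one row or column (illustrative assumption):
    the lowest positive score should exceed the highest negative
    score by at least `margin`."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        # Nothing to compare when one side is empty.
        return 0.0
    return max(0.0, margin + max(neg) - min(pos))


def joint_max_margin(score_matrix, label_matrix, margin=1.0):
    """Sum the penalty over rows (labels within each image) and over
    columns (images within each label). Rows are images, columns are labels."""
    row_loss = sum(
        max_margin_term(s, y, margin)
        for s, y in zip(score_matrix, label_matrix)
    )
    col_loss = sum(
        max_margin_term(s, y, margin)
        for s, y in zip(zip(*score_matrix), zip(*label_matrix))
    )
    return row_loss + col_loss


# Toy batch: 2 images, 3 labels (values are made up for illustration).
scores = [[2.0, 0.5, -1.0],
          [0.2, 1.5, 0.0]]
labels = [[1, 0, 0],
          [0, 1, 1]]
print(joint_max_margin(scores, labels))
```

In this toy batch only the second image contributes a penalty: one of its positive labels scores 0.0, which is below its negative label's score of 0.2.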
Pages: 7