Multi-label Image Classification via Coarse-to-Fine Attention

Cited by: 6
Authors
Lyu, Fan [1, 2]
Li, Linyan [3]
Sheng, Victor S. [4]
Fu, Qiming [1]
Hu, Fuyuan [1]
Affiliations
[1] Suzhou Univ Sci & Technol, Sch Elect & Informat Engn, Suzhou 215009, Peoples R China
[2] Tianjin Univ, Coll Intelligence & Comp, Tianjin 300072, Peoples R China
[3] Suzhou Inst Trade & Commerce, Suzhou 215009, Peoples R China
[4] Texas Tech Univ, Dept Comp Sci, Lubbock, TX 79409 USA
Funding
National Natural Science Foundation of China;
Keywords
feature extraction; image classification; image recognition; image representation; learning (artificial intelligence); neural nets; object detection; visual databases; attention mechanism; conventional attention-based methods; positive labels; negative labels; image classification method; popular multilabel image datasets; multilabel image classification; coarse-to-fine attention; Multi-label classification; Convolutional neural network; Recurrent neural network; Attention;
DOI
10.1049/cje.2019.07.015
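Chinese Library Classification number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Subject classification code
0808; 0809;
Abstract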
Considerable effort has gone into using deep neural networks to recognize multi-label images. Because multi-label image classification is highly complex, many studies use the attention mechanism as a form of guidance. Conventional attention-based methods analyze images directly and aggressively, which makes it difficult for them to understand complicated scenes well. We propose a global/local attention method that recognizes a multi-label image from coarse to fine by mimicking how human beings observe images. Our global/local attention method first concentrates on the whole image and then focuses on specific local objects. We also propose a joint max-margin objective function, which enforces that the minimum score of positive labels be larger than the maximum score of negative labels, both horizontally and vertically. This function further improves our multi-label image classification method. We evaluate the effectiveness of our method on two popular multi-label image datasets (i.e., Pascal VOC and MS-COCO). Our experimental results show that our method outperforms state-of-the-art methods.
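The max-margin constraint in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula, so everything here is an assumption rather than the authors' objective: "horizontally" is read as across the labels of one image (a row of the score matrix), "vertically" as across the images of one label (a column), the penalty is a standard hinge, and the names `max_margin_term`, `joint_max_margin_loss`, and `margin` are hypothetical.

```python
def max_margin_term(scores, labels, margin=1.0):
    """Hinge penalty for one row or column (assumed form, not from the paper).

    Zero when the lowest positive score exceeds the highest negative
    score by at least `margin`; otherwise grows linearly with the gap.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        # No positive/negative pair to compare, so no penalty.
        return 0.0
    return max(0.0, margin + max(neg) - min(pos))


def joint_max_margin_loss(score_matrix, label_matrix, margin=1.0):
    """Average row-wise penalty plus average column-wise penalty.

    Rows are images and columns are labels (an assumed layout).
    """
    n_rows = len(score_matrix)
    n_cols = len(score_matrix[0])

    # "Horizontal": compare labels within each image.
    row_loss = sum(
        max_margin_term(score_matrix[i], label_matrix[i], margin)
        for i in range(n_rows)
    )

    # "Vertical": compare images within each label.
    col_loss = sum(
        max_margin_term(
            [score_matrix[i][j] for i in range(n_rows)],
            [label_matrix[i][j] for i in range(n_rows)],
            margin,
        )
        for j in range(n_cols)
    )
    return row_loss / n_rows + col_loss / n_cols
```

Under these assumptions, a batch whose positive scores all clear the negatives by the margin yields a loss of zero, while poorly separated scores are penalized in both directions.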
Pages: 1118-1126
Page count: 9