General Multi-label Image Classification with Transformers

Cited by: 178
Authors
Lanchantin, Jack [1]
Wang, Tianlu [1]
Ordonez, Vicente [1]
Qi, Yanjun [1]
Affiliation
[1] Univ Virginia, Charlottesville, VA 22903 USA
Funding
U.S. National Science Foundation
DOI
10.1109/CVPR46437.2021.01621
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multi-label image classification is the task of predicting a set of labels corresponding to objects, attributes or other entities present in an image. In this work we propose the Classification Transformer (C-Tran), a general framework for multi-label image classification that leverages Transformers to exploit the complex dependencies among visual features and labels. Our approach consists of a Transformer encoder trained to predict a set of target labels given an input set of masked labels, and visual features from a convolutional neural network. A key ingredient of our method is a label mask training objective that uses a ternary encoding scheme to represent the state of the labels as positive, negative, or unknown during training. Our model shows state-of-the-art performance on challenging datasets such as COCO and Visual Genome. Moreover, because our model explicitly represents the label state during training, it is more general by allowing us to produce improved results for images with partial or extra label annotations during inference. We demonstrate this additional capability in the COCO, Visual Genome, News-500, and CUB image datasets.
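The ternary label-state encoding and label mask training described in the abstract can be illustrated with a minimal sketch. The state values, the function names `mask_labels` and `encode_partial`, and the 25% lower bound on the fraction of masked labels are assumptions made for illustration and do not come from this record. The sketch covers only how label states might be prepared; the Transformer encoder and the convolutional feature extractor are omitted.

```python
import random

# Hypothetical ternary state codes (the concrete values are an assumption).
UNKNOWN, NEGATIVE, POSITIVE = 0, -1, 1


def mask_labels(labels, min_unknown_frac=0.25, rng=None):
    """Build training-time label states from binary ground truth.

    A random subset of labels is set to UNKNOWN; the model would be trained
    to predict those labels from the image features and the remaining known
    states. Returns (states, masked_indices).
    """
    rng = rng or random.Random()
    n = len(labels)
    if n == 0:
        return [], []
    # Assumed policy: mask at least min_unknown_frac of the labels, at most all.
    low = min(n, max(1, int(min_unknown_frac * n)))
    n_unknown = rng.randint(low, n)
    masked = set(rng.sample(range(n), n_unknown))
    states = [
        UNKNOWN if i in masked else (POSITIVE if y == 1 else NEGATIVE)
        for i, y in enumerate(labels)
    ]
    return states, sorted(masked)


def encode_partial(num_labels, known):
    """Build inference-time states when only some labels are annotated.

    `known` maps a label index to 0 or 1; every other label stays UNKNOWN.
    """
    states = [UNKNOWN] * num_labels
    for idx, value in known.items():
        states[idx] = POSITIVE if value == 1 else NEGATIVE
    return states


if __name__ == "__main__":
    ground_truth = [1, 0, 1, 0, 0, 1, 0, 0]
    states, masked = mask_labels(ground_truth, rng=random.Random(0))
    print("training states:", states, "masked:", masked)
    print("partial labels :", encode_partial(4, {0: 1, 3: 0}))
```

With no annotations at inference, every state would be UNKNOWN, which corresponds to the ordinary multi-label setting; supplying some known labels through `encode_partial` corresponds to the partial-annotation setting the abstract mentions.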
Pages: 16473-16483
Page count: 11