Multi-label image classification with recurrently learning semantic dependencies

Cited by: 15
Authors
Chen, Long [1]
Wang, Ronggui [1]
Yang, Juan [1]
Xue, Lixia [1]
Hu, Min [1]
Affiliations
[1] Hefei Univ Technol, Sch Comp & Informat, Hefei 230601, Anhui, Peoples R China
Source
VISUAL COMPUTER | 2019 / Vol. 35 / Issue 10
Funding
National Natural Science Foundation of China;
Keywords
Multi-label; CNN-RNN; Attention; LSTM; Dependencies;
DOI
10.1007/s00371-018-01615-0
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Recognizing multi-label images is a significant but challenging task toward high-level visual understanding. Remarkable success has been achieved by applying CNN-RNN models to capture the underlying semantic dependencies of labels and predict the label distributions over the global-level features output by CNNs. However, such global-level features often fuse the information of multiple objects, which makes it difficult to recognize small objects and to capture label correlations. To address this problem, we propose a novel multi-label image classification framework that improves on the CNN-RNN design pattern. By introducing an attention network module into the CNN-RNN architecture, the object features of the attention map are separated by channel and then sent to an LSTM network, which captures dependencies and predicts labels sequentially. A category-wise max-pooling operation is then performed to integrate these labels into the final prediction. Experimental results on the PASCAL2007 and MS-COCO datasets demonstrate that our model can effectively exploit the correlation between labels to improve classification performance and to better recognize small targets.
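The final fusion step mentioned in the abstract, category-wise max-pooling over the sequential predictions, can be illustrated with a minimal sketch. It assumes the LSTM emits one score per category at each step; the function names, the example scores, and the 0.5 threshold are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def category_wise_max_pool(step_scores: Sequence[Sequence[float]]) -> List[float]:
    """Fuse per-step scores by keeping, for each category, its highest score.

    step_scores[t][c] is the (assumed) score of category c at step t.
    """
    if not step_scores:
        raise ValueError("need at least one step of scores")
    num_categories = len(step_scores[0])
    if any(len(row) != num_categories for row in step_scores):
        raise ValueError("every step must score the same number of categories")
    # zip(*rows) transposes steps x categories into categories x steps.
    return [max(column) for column in zip(*step_scores)]


def predict_labels(step_scores: Sequence[Sequence[float]],
                   threshold: float = 0.5) -> List[int]:
    """Return indices of categories whose pooled score reaches the threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    pooled = category_wise_max_pool(step_scores)
    return [index for index, score in enumerate(pooled) if score >= threshold]


# Three hypothetical steps over three categories.
scores = [
    [0.9, 0.1, 0.2],
    [0.2, 0.8, 0.1],
    [0.1, 0.3, 0.4],
]
pooled_scores = category_wise_max_pool(scores)
predicted = predict_labels(scores)
```

Taking the maximum per category means a label is kept if any single step is confident about it, which is one plausible reading of how sequential predictions are integrated into a single multi-label output.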
Pages: 1361-1371
Page count: 11