A semantic guidance-based fusion network for multi-label image classification

Citations: 0
Authors
Wang, Jiuhang [1, 2]
Tang, Hongying [1]
Luo, Shanshan [1]
Yang, Liqi [1, 2]
Liu, Shusheng [1, 2]
Hong, Aoping [1, 2]
Li, Baoqing [1]
Affiliations
[1] Shanghai Institute Microsyst & Informat Technol, Sci & Technol Microsyst Lab, 1455 Pingcheng Rd, Shanghai 201800, Peoples R China
[2] Univ Chinese Acad Sci, Sch Elect Elect & Commun, 1 Yanqihu East Rd, Beijing 100049, Peoples R China
Keywords
Image spatial correlation; Label semantic correlation; Layered semantic guidance fusion; Multi-label image classification
DOI
10.1016/j.patrec.2024.08.020
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multi-label image classification (MLIC), a fundamental task that assigns multiple labels to each image, has seen notable progress in recent years. Because objects often appear together in the physical world, modeling object correlations is crucial for improving classification accuracy. This requires accounting for both spatial image feature correlation and label semantic correlation. However, existing methods struggle to establish these correlations because the spatial locations and label semantic relationships involved are complex. Moreover, to fuse image feature relevance with label semantic relevance, existing methods typically learn a semantic representation only in the final CNN layer, even though different CNN layers capture features at different scales and have distinct discriminative abilities. To address these issues, we introduce the Semantic Guidance-Based Fusion Network (SGFN) for MLIC. To model spatial image feature correlation, we use the TResNet architecture as the backbone network and employ a Feature Aggregation Module to capture global spatial correlation. For label semantic correlation, we establish both local and global semantic correlations. We further enrich the model's features by learning semantic representations across multiple convolutional layers. Our method outperforms current state-of-the-art techniques on the PASCAL VOC (2007, 2012) and MS-COCO datasets.
Pages: 254-261
Number of pages: 8