A semantic guidance-based fusion network for multi-label image classification

Citations: 0

Authors:

Wang, Jiuhang ^{[1,2]}

Tang, Hongying ^{[1]}

Luo, Shanshan ^{[1]}

Yang, Liqi ^{[1,2]}

Liu, Shusheng ^{[1,2]}

Hong, Aoping ^{[1,2]}

Li, Baoqing ^{[1]}

Affiliations:

[1] Shanghai Institute Microsyst & Informat Technol, Sci & Technol Microsyst Lab, 1455 Pingcheng Rd, Shanghai 201800, Peoples R China

[2] Univ Chinese Acad Sci, Sch Elect Elect & Commun, 1 Yanqihu East Rd, Beijing 100049, Peoples R China

Source:

PATTERN RECOGNITION LETTERS | 2024 / Vol. 185

Keywords:

Image spatial correlation; Label semantic correlation; Layered semantic guidance fusion; Multi-label image classification

DOI:

10.1016/j.patrec.2024.08.020

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Multi-label image classification (MLIC), a fundamental task that assigns multiple labels to each image, has seen notable progress in recent years. Because objects frequently co-occur in the physical world, modeling object correlations is crucial for improving classification accuracy. This involves accounting for both spatial image feature correlation and label semantic correlation. However, existing methods struggle to establish these correlations because spatial locations and label semantic relationships are complex. Moreover, when fusing image feature relevance with label semantic relevance, existing methods typically learn a semantic representation in the final CNN layer to combine spatial and label semantic correlations, even though different CNN layers capture features at diverse scales and possess distinct discriminative abilities. To address these issues, we introduce the Semantic Guidance-Based Fusion Network (SGFN) for MLIC. To model spatial image feature correlation, we use the TResNet architecture as the backbone network and employ the Feature Aggregation Module to capture global spatial correlation. For label semantic correlation, we establish both local and global semantic correlation. We further enrich model features by learning semantic representations across multiple convolutional layers. Our method outperforms current state-of-the-art techniques on the PASCAL VOC (2007, 2012) and MS-COCO datasets.


Pages: 254-261

Page count: 8

