Graph-based Consistent Reconstruction and Alignment for imbalanced text-image person re-identification

Cited by: 0

Authors:

Du, Guodong [1]

Gong, Tiantian [1]

Zhang, Liyan [1]

Affiliation:

[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Nanjing, Peoples R China

Source:

EXPERT SYSTEMS WITH APPLICATIONS | 2025 / Vol. 260

Funding:

National Natural Science Foundation of China;

Keywords:

Person re-identification; Image-text retrieval; Cross-modal alignment; Modality imbalance; Robustness;

DOI:

10.1016/j.eswa.2024.125429

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Text-image person re-identification (TIReID) has emerged as a versatile approach for retrieving target pedestrians using textual descriptions. However, current TIReID research has been overly idealistic and has overlooked the data incompleteness and modal imbalance found in real-world application scenarios. In this paper, we therefore propose imbalanced text-image person re-identification (ITIReID) to address these problems. Compared with TIReID, ITIReID contains a larger proportion of unimodal data, which leads to modal imbalance. The ITIReID setting is closer to real-world scenarios, and studying it can broaden the applicability of TIReID. For ITIReID, we propose a Graph-based Consistent Reconstruction and Alignment framework (GCRA), which achieves modal balance by completing missing modality features for training. Treating the accessible modality features as graph nodes, GCRA first builds an adjacency graph whose edges are measured by a new semantic distance, which establishes semantic relevance between nodes by jointly measuring intra-modality and inter-modality correlation. GCRA then reconstructs the missing nodes, and thus the missing modality features, from existing nodes connected with high semantic relevance. To ensure that the reconstructed features are reliable and effective, we propose a proxy-based identity constraint and a reconstruction constraint. In addition, to enable effective semantic alignment using both the reconstructed and the original features, we introduce a cross-modal semantic constraint. Extensive experiments demonstrate that GCRA can effectively handle data incompleteness and modal imbalance, exhibiting its effectiveness and superiority.
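
The reconstruction idea in the abstract can be illustrated with a minimal sketch. It is not GCRA's actual formulation: the abstract does not define the semantic distance or the constraints, so the sketch assumes plain cosine similarity in the shared (image) modality as the edge weight, a top-k neighbor selection, and a softmax-weighted average as the reconstruction. The names `cosine`, `reconstruct_missing`, `k`, and `temperature` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def reconstruct_missing(image_feats, text_feats, k=2, temperature=0.1):
    """Return a copy of text_feats with every None entry filled in.

    image_feats: list of image feature vectors, one per sample (all present)
    text_feats:  list of text feature vectors, None where the text is missing

    Assumption: edge relevance is cosine similarity between image features;
    the paper's semantic distance also uses inter-modality correlation.
    """
    # Donor nodes are the samples whose text modality is available.
    donors = [j for j, t in enumerate(text_feats) if t is not None]
    filled = list(text_feats)
    for i, t in enumerate(text_feats):
        if t is not None or not donors:
            continue
        # Relevance of node i to every donor, measured in the shared modality.
        sims = {j: cosine(image_feats[i], image_feats[j]) for j in donors}
        # Keep only the k donors with the highest relevance.
        top = sorted(donors, key=sims.get, reverse=True)[:k]
        weights = [math.exp(sims[j] / temperature) for j in top]
        total = sum(weights)
        dim = len(text_feats[top[0]])
        # Softmax-weighted average of the donors' text features.
        filled[i] = [
            sum(w * text_feats[j][d] for w, j in zip(weights, top)) / total
            for d in range(dim)
        ]
    return filled


# Toy data: sample 1 has an image but no text description.
image_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
text_feats = [[1.0, 1.0], None, [5.0, 5.0]]
completed = reconstruct_missing(image_feats, text_feats, k=1)
```

In this toy setup the missing text feature is borrowed from the sample whose image is most similar. A lower `temperature` concentrates the weights on the closest neighbors, while a larger `k` averages over more of them.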


Pages: 14

