Graph-based Consistent Reconstruction and Alignment for imbalanced text-image person re-identification

Authors
Du, Guodong [1]
Gong, Tiantian [1]
Zhang, Liyan [1]
Affiliations
[1] Nanjing University of Aeronautics and Astronautics, College of Computer Science and Technology, Nanjing, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Person re-identification; Image-text retrieval; Cross-modal alignment; Modality imbalance; Robustness
DOI
10.1016/j.eswa.2024.125429
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Text-image person re-identification (TIReID) has emerged as a versatile approach for retrieving target pedestrians from textual descriptions. However, current TIReID research makes overly idealistic assumptions and overlooks the data incompleteness and modal imbalance found in real-world application scenarios. In this paper, we therefore propose imbalanced text-image person re-identification (ITIReID) to address these problems. Compared with TIReID, ITIReID contains a larger proportion of unimodal data, which leads to modal imbalance. The ITIReID setting is closer to real-world scenarios, and studying it can broaden the applicability of TIReID. We propose a Graph-based Consistent Reconstruction and Alignment (GCRA) framework for ITIReID, which achieves modal balance by completing the missing modality features during training. Treating the accessible modality features as graph nodes, GCRA first builds an adjacency graph whose edges are weighted by a new semantic distance, which establishes semantic relevance between nodes by jointly measuring intra-modality and inter-modality correlation. GCRA then reconstructs the missing nodes, and thereby the missing modality features, from existing nodes to which they are connected with high semantic relevance. To ensure that the reconstructed features are reliable and effective, we propose a proxy-based identity constraint and a reconstruction constraint. In addition, to enable effective semantic alignment using both the reconstructed and the original features, we introduce a cross-modal semantic constraint. Extensive experiments demonstrate that GCRA effectively handles data incompleteness and modal imbalance.
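The abstract describes the reconstruction step only at a high level, so the following is a minimal illustrative sketch and not the authors' GCRA implementation. It assumes image and text features live in a shared embedding space, that every sample has an image feature, and that a missing text feature is marked with `None`. The relevance score (a blend of image-image and image-text cosine similarity controlled by `alpha`), the top-`k` neighbour selection, and the softmax `temperature` are all hypothetical choices made here to show how a missing node could be rebuilt from highly relevant existing nodes.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def semantic_relevance(img_q, img_n, txt_n, alpha=0.5):
    """Hypothetical edge weight: blend of intra-modality (image-image)
    and inter-modality (image-text) cosine similarity."""
    return alpha * cosine(img_q, img_n) + (1 - alpha) * cosine(img_q, txt_n)


def reconstruct_missing_text(img_feats, txt_feats, k=2, alpha=0.5, temperature=0.1):
    """Return a copy of txt_feats where each None entry is replaced by a
    softmax-weighted average of the text features of its k most relevant
    complete neighbours (assumed scheme, not taken from the paper)."""
    complete = [j for j, t in enumerate(txt_feats) if t is not None]
    if not complete:
        raise ValueError("at least one sample must have a text feature")
    out = list(txt_feats)
    for i, t in enumerate(txt_feats):
        if t is not None:
            continue
        # Score every complete node against the incomplete node i.
        scored = sorted(
            ((semantic_relevance(img_feats[i], img_feats[j], txt_feats[j], alpha), j)
             for j in complete),
            reverse=True,
        )[:k]
        # Numerically stable softmax over the top-k relevance scores.
        top = scored[0][0]
        weights = [math.exp((s - top) / temperature) for s, _ in scored]
        total = sum(weights)
        dim = len(txt_feats[scored[0][1]])
        out[i] = [
            sum(w * txt_feats[j][d] for w, (_, j) in zip(weights, scored)) / total
            for d in range(dim)
        ]
    return out


if __name__ == "__main__":
    img_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
    txt_feats = [[1.0, 0.0], None, [0.0, 1.0]]
    print(reconstruct_missing_text(img_feats, txt_feats))
```

In this toy input the second sample has no text feature, and its image feature is close to that of the first sample, so the reconstructed text feature lands close to the first sample's text feature. The identity, reconstruction, and cross-modal semantic constraints named in the abstract are not modelled here because the record gives no formulas for them.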
Pages: 14
Related papers
  • [1] Cross-modal feature learning and alignment network for text-image person re-identification
    Huang, Bailiang
    Qi, Xiaolong
    Chen, Bin
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2024, 103
  • [2] Multimodal Feature Hierarchical Fusion for Text-Image Person Re-identification
    Li, Jiaxuan
    Huang, Likun
    Zhu, Chuanhu
    Zhang, Song
    Li, Qiang
    PATTERN RECOGNITION AND COMPUTER VISION, PT V, PRCV 2024, 2025, 15035 : 468 - 481
  • [3] Bottom-up color-independent alignment learning for text-image person re-identification
    Du, Guodong
    Zhu, Hanyue
    Zhang, Liyan
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 138
  • [4] Person re-identification by graph-based metric fusion
    Xie, Yi
    Levine, Martin D.
    Yu, Huimin
    ELECTRONICS LETTERS, 2016, 52 (17) : 1447 - 1448
  • [5] Clothes-Changing Person Re-identification Method Based on Text-Image Mutual Learning
    Ge, Bin
    Lu, Yang
    Xia, Chenxing
    Guan, Junming
    Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, 2024, 37 (11) : 960 - 973
  • [6] CLIP-Driven Fine-Grained Text-Image Person Re-Identification
    Yan, Shuanglin
    Dong, Neng
    Zhang, Liyan
    Tang, Jinhui
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 6032 - 6046
  • [7] Spatial enhanced multi-level alignment learning for text-image person re-identification with coupled noisy labels
    Zhao, Jiacheng
    Che, Haojie
    Li, Yongxi
    MULTIMEDIA SYSTEMS, 2025, 31 (02)
  • [8] Text-to-Image Person Re-Identification Based on Multimodal Graph Convolutional Network
    Han, Guang
    Lin, Min
    Li, Ziyang
    Zhao, Haitao
    Kwong, Sam
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 6025 - 6036
  • [9] HIERARCHICAL ATTENTION IMAGE-TEXT ALIGNMENT NETWORK FOR PERSON RE-IDENTIFICATION
    Kansal, Kajal
    Subramanyam, A. V.
    Wang, Zheng
    Satoh, Shinichi
    2021 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW), 2021
  • [10] Graph-Based Self-Learning for Robust Person Re-identification
    Xian, Yuqiao
    Yang, Jinrui
    Yu, Fufu
    Zhang, Jun
    Sun, Xing
    2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023 : 4778 - 4787