TriMatch: Triple Matching for Text-to-Image Person Re-Identification

Citations: 0
Authors
Yan, Shuanglin [1]
Dong, Neng [1]
Li, Shuang [2]
Li, Huafeng [3]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China
[2] Chongqing Univ Posts & Telecommun, Chongqing Key Lab Image Cognit, Chongqing 400065, Peoples R China
[3] Kunming Univ Sci & Technol, Fac Informat Engn & Automat, Kunming 650500, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Semantics; Visualization; Text to image; Accuracy; Identification of persons; Vectors; Tuning; Transforms; Training; Head; Text-to-image person re-identification; heterogeneous gaps; cross-modal matching; unimodal matching; NETWORK;
DOI
10.1109/LSP.2025.3534689
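Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract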
Text-to-image person re-identification (TIReID) is a cross-modal retrieval task that aims to retrieve target person images based on a given text description. Existing methods primarily focus on mining the semantic associations across modalities, relying on the matching between heterogeneous features for retrieval. However, due to the inherent heterogeneous gaps between modalities, it is challenging to establish precise semantic associations, particularly in fine-grained correspondences, often leading to incorrect retrieval results. To address this issue, this letter proposes an innovative Triple Matching (TriMatch) framework that integrates cross-modal (image-text) matching and unimodal (image-image, text-text) matching for high-precision person retrieval. The framework introduces a generation task that performs cross-modal (image-to-text and text-to-image) feature generation and intra-modal feature alignment to achieve unimodal matching. By incorporating the generation task, TriMatch considers not only the semantic correlations between modalities but also the semantic consistency within single modalities, thereby effectively enhancing the accuracy of target person retrieval. Extensive experiments on multiple datasets demonstrate the superiority of TriMatch over existing methods.
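The combination of one cross-modal score with two unimodal scores described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the three matches are fused, so the weighted sum of cosine similarities, the function names (`cosine`, `triple_match_score`, `rank_gallery`), and the `weights` parameter are all assumptions made for illustration. The generated features are assumed to come from some already-trained generator, which is not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triple_match_score(text_feat, image_feat, gen_image_feat, gen_text_feat,
                       weights=(1.0, 1.0, 1.0)):
    """Fuse three similarities into one retrieval score (assumed fusion rule).

    text_feat      -- feature of the query text
    image_feat     -- feature of a gallery image
    gen_image_feat -- image-like feature generated from the query text (hypothetical)
    gen_text_feat  -- text-like feature generated from the gallery image (hypothetical)
    """
    w_cross, w_img, w_txt = weights
    cross = cosine(text_feat, image_feat)         # image-text matching
    img_img = cosine(gen_image_feat, image_feat)  # image-image matching
    txt_txt = cosine(text_feat, gen_text_feat)    # text-text matching
    return w_cross * cross + w_img * img_img + w_txt * txt_txt


def rank_gallery(text_feat, gen_image_feat, gallery):
    """Return gallery indices sorted from best to worst match."""
    scores = [
        triple_match_score(text_feat, item["image"], gen_image_feat, item["gen_text"])
        for item in gallery
    ]
    return sorted(range(len(gallery)), key=lambda i: scores[i], reverse=True)


# Toy two-dimensional features, chosen only to make the arithmetic easy to follow.
gallery = [
    {"image": [0.0, 1.0], "gen_text": [0.0, 1.0]},
    {"image": [1.0, 0.0], "gen_text": [1.0, 0.0]},
]
ranking = rank_gallery([1.0, 0.0], [0.0, 1.0], gallery)
```

In this toy case the second gallery entry agrees with the query on both the image-text and text-text scores, so it ranks first even though the generated image feature favors the first entry.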
Pages: 806-810
Page count: 5