DSSL: Deep Surroundings-person Separation Learning for Text-based Person Retrieval

被引:70
|
作者
Zhu, Aichun [1 ]
Wang, Zijie [1 ]
Li, Yifeng [1 ]
Wan, Xili [1 ]
Jin, Jing [1 ]
Wang, Tian [2 ]
Hu, Fangqiang [1 ]
Hua, Gang [3 ]
机构
[1] Nanjing Tech Univ, Nanjing, Peoples R China
[2] Beihang Univ, Beijing, Peoples R China
[3] China Univ Min & Technol, Xuzhou, Jiangsu, Peoples R China
基金
中国国家自然科学基金; 中国博士后科学基金;
关键词
person retrieval; text-based person re-identification; cross-modal retrieval; surroundings-person separation;
D O I
10.1145/3474085.3475369
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Many previous methods on text-based person retrieval tasks are devoted to learning a latent common space mapping, with the purpose of extracting modality-invariant features from both visual and textual modality. Nevertheless, due to the complexity of high-dimensional data, the unconstrained mapping paradigms are not able to properly catch discriminative clues about the corresponding person while drop the misaligned information. Intuitively, the information contained in visual data can be divided into person information (PI) and surroundings information (SI), which are mutually exclusive from each other. To this end, we propose a novel Deep Surroundings-person Separation Learning (DSSL) model in this paper to effectively extract and match person information, and hence achieve a superior retrieval accuracy. A surroundings-person separation and fusion mechanism plays the key role to realize an accurate and effective surroundings-person separation under a mutually exclusion constraint. In order to adequately utilize multimodal and multi-granular information for a higher retrieval accuracy, five diverse alignment paradigms are adopted. Extensive experiments are carried out to evaluate the proposed DSSL on CUHK-PEDES, which is currently the only accessible dataset for text-base person retrieval task. DSSL achieves the state-of-the-art performance on CUHK-PEDES. To properly evaluate our proposed DSSL in the real scenarios, a Real Scenarios Text-based Person Reidentification (RSTPReid) dataset is constructed to benefit future research on text-based person retrieval, which will be publicly available.
引用
收藏
页码:209 / 217
页数:9
相关论文
共 50 条
  • [1] Learning Semantic Polymorphic Mapping for Text-Based Person Retrieval
    Li, Jiayi
    Jiang, Min
    Kong, Jun
    Tao, Xuefeng
    Luo, Xi
    [J]. IEEE Transactions on Multimedia, 2024, 26 : 10678 - 10691
  • [2] DCEL: Deep Cross-modal Evidential Learning for Text-Based Person Retrieval
    Li, Shenshen
    Xu, Xing
    Yang, Yang
    Shen, Fumin
    Mo, Yijun
    Li, Yujie
    Shen, Heng Tao
    [J]. PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 6292 - 6300
  • [3] Adaptive Uncertainty-Based Learning for Text-Based Person Retrieval
    Li, Shenshen
    He, Chen
    Xu, Xing
    Shen, Fumin
    Yang, Yang
    Shen, Heng Tao
    [J]. THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 4, 2024, : 3172 - 3180
  • [4] Causality-Inspired Invariant Representation Learning for Text-Based Person Retrieval
    Liu, Yu
    Qin, Guihe
    Chen, Haipeng
    Cheng, Zhiyong
    Yang, Xun
    [J]. THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 12, 2024, : 14052 - 14060
  • [5] SUM: Serialized Updating and Matching for text-based person retrieval
    Wang, Zijie
    Zhu, Aichun
    Xue, Jingyi
    Jiang, Daihong
    Liu, Chao
    Li, Yifeng
    Hu, Fangqiang
    [J]. KNOWLEDGE-BASED SYSTEMS, 2022, 248
  • [6] Multi-granularity Separation Network for Text-Based Person Retrieval with Bidirectional Refinement Regularization
    Li, Shenshen
    Xu, Xing
    Shen, Fumin
    Yang, Yang
    [J]. PROCEEDINGS OF THE 2023 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2023, 2023, : 307 - 315
  • [7] Fine-grained Semantics-aware Representation Learning for Text-based Person Retrieval
    Wang, Di
    Yan, Feng
    Wang, Yifeng
    Zhao, Lin
    Liang, Xiao
    Zhong, Haodi
    Zhang, Ronghua
    [J]. PROCEEDINGS OF THE 4TH ANNUAL ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2024, 2024, : 92 - 100
  • [8] Pedestrian-specific Bipartite-aware Similarity Learning for Text-based Person Retrieval
    Shen, Fei
    Shu, Xiangbo
    Du, Xiaoyu
    Tang, Jinhui
    [J]. PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 8922 - 8931
  • [9] Text-based Person Search via Virtual Attribute Learning
    Wang C.-J.
    Su J.-W.
    Luo Z.-M.
    Cao D.-L.
    Lin Y.-J.
    Li S.-Z.
    [J]. Ruan Jian Xue Bao/Journal of Software, 2023, 34 (05): : 2035 - 2050
  • [10] Conditional Feature Learning Based Transformer for Text-Based Person Search
    Gao, Chenyang
    Cai, Guanyu
    Jiang, Xinyang
    Zheng, Feng
    Zhang, Jun
    Gong, Yifei
    Lin, Fangzhou
    Sun, Xing
    Bai, Xiang
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 6097 - 6108