CAIBC: Capturing All-round Information Beyond Color for Text-based Person Retrieval

Cited by: 34
Authors
Wang, Zijie [1]
Zhu, Aichun [1]
Xue, Jingyi [1]
Wan, Xili [1]
Liu, Chao [2]
Wang, Tian [3]
Li, Yifeng [1]
Affiliations
[1] Nanjing Tech Univ, Nanjing, Peoples R China
[2] Jinling Inst Technol, Nanjing, Peoples R China
[3] Beihang Univ, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Text-based Person Retrieval; Person Re-identification; Cross-modal Retrieval; Multi-branch; Color Information; Mutual Learning
DOI
10.1145/3503161.3548057
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Given a natural language description, text-based person retrieval aims to identify images of a target person from a large-scale person image database. Existing methods generally suffer from a color over-reliance problem: the models rely heavily on color information when matching cross-modal data. Color is indeed an important basis for retrieval decisions, but over-reliance on it distracts the model from other key clues (e.g., texture and structural information) and thereby leads to sub-optimal retrieval performance. To solve this problem, we propose to Capture All-round Information Beyond Color (CAIBC) via a jointly optimized multi-branch architecture for text-based person retrieval. CAIBC contains three branches: an RGB branch, a grayscale (GRS) branch, and a color (CLR) branch. To make full use of all-round information in a balanced and effective way, a mutual learning mechanism enables the three branches, which attend to different aspects of the information, to communicate with and learn from each other. Extensive experiments evaluate CAIBC on the CUHK-PEDES and RSTPReid datasets in both supervised and weakly supervised text-based person retrieval settings, and demonstrate that CAIBC significantly outperforms existing methods and achieves state-of-the-art performance on all three tasks.
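The abstract does not state how the grayscale input is produced or what form the mutual learning objective takes. As a rough illustration only, the sketch below assumes a standard BT.601 luma conversion for the grayscale view and a pairwise KL-divergence term between the branches' matching distributions, in the style of deep mutual learning. The function names, the branch keys (`rgb`, `grs`, `clr`), and the example scores are hypothetical and are not taken from the paper.

```python
import math

def to_grayscale(pixel):
    """Convert one (R, G, B) pixel to luma with ITU-R BT.601 weights (assumed)."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b

def softmax(scores):
    """Numerically stable softmax over a list of similarity scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def mutual_learning_loss(branch_scores):
    """Average KL divergence over all ordered pairs of branches.

    Assumption: each branch yields text-to-image similarity scores over the
    same gallery, and every branch is pulled toward each of its peers.
    """
    probs = {name: softmax(s) for name, s in branch_scores.items()}
    total, pairs = 0.0, 0
    for a in probs:
        for b in probs:
            if a != b:
                total += kl_divergence(probs[b], probs[a])
                pairs += 1
    return total / pairs

# Hypothetical similarity scores of one text query against three gallery images.
scores = {
    "rgb": [2.0, 0.5, 0.1],
    "grs": [1.5, 0.7, 0.2],
    "clr": [2.5, 0.2, 0.1],
}
loss = mutual_learning_loss(scores)
```

Under these assumptions the loss is zero when all three branches produce the same distribution and grows as they disagree, which is one way branches attending to different cues could be encouraged to learn from each other.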
Pages: 5314-5322
Number of pages: 9