Joint discriminative representation learning for end-to-end person search

Cited by: 17
Authors
Zhang, Pengcheng [1]
Yu, Xiaohan [2, 3]
Bai, Xiao [1]
Wang, Chen [1]
Zheng, Jin [1]
Ning, Xin [4]
Affiliations
[1] Beihang Univ, Jiangxi Res Inst, Sch Comp Sci & Engn, State Key Lab Software Dev Environm, Beijing, Peoples R China
[2] Macquarie Univ, Sch Comp, Sydney, Australia
[3] Griffith Univ, Inst Integrated & Intelligent Syst, Brisbane, Australia
[4] Chinese Acad Sci, Inst Semicond, Beijing, Peoples R China
Funding
US National Science Foundation;
Keywords
Person search; Person re-identification; Part segmentation; Batch sampling; NETWORK;
DOI
10.1016/j.patcog.2023.110053
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Person search simultaneously detects and retrieves a query person from uncropped scene images. Existing methods are either two-step or end-to-end. The former employs two standalone models for the two sub-tasks, while the latter conducts person search with a unified model. Despite encouraging progress, most existing end-to-end methods focus on balancing the model between the detection and retrieval sub-tasks while neglecting to enhance the learned representation for retrieval, which leads to lower accuracy than two-step approaches. To that end, we propose a novel hierarchical framework that jointly optimizes instance-aware and part-aware embeddings to enable discriminative representation learning. Specifically, we develop a region-of-interest cosegment (ROICoseg) module that captures part-aware information without requiring extra annotations to enable fine-grained discriminative representation. On top of that, a Contextual Instance Batch Sampling (CIBS) method is introduced to effectively employ contextual information for constructing training batches, thus facilitating effective instance-aware representation learning. We further introduce the first cross-door person search dataset (CDPS), in which a target person is retrieved from outdoor cameras using an indoor-captured image, or vice versa. Extensive experiments show that our proposed model achieves competitive performance on CUHK-SYSU and outperforms state-of-the-art end-to-end methods on the more challenging PRW and CDPS.
Pages: 11
Related papers
50 in total
  • [1] Learning Scene-Pedestrian Graph for End-to-End Person Search
    Song, Zifan
    Zhao, Cairong
    Hu, Guosheng
    Miao, Duoqian
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2024, 20 (02) : 2979 - 2990
  • [2] Sequential Transformer for End-to-End Person Search
    Chen, Long
    Xu, Jinhua
    NEURAL INFORMATION PROCESSING, ICONIP 2023, PT IV, 2024, 14450 : 226 - 238
  • [3] Cascade Transformers for End-to-End Person Search
    Yu, Rui
    Du, Dawei
    LaLonde, Rodney
    Davila, Daniel
    Funk, Christopher
    Hoogs, Anthony
    Clipp, Brian
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 7257 - 7266
  • [4] End-to-End Thorough Body Perception for Person Search
    Tian, Kun
    Huang, Houjing
    Ye, Yun
    Li, Shiyu
    Lin, Jinbin
    Huang, Guan
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 12079 - 12086
  • [5] Sequential End-to-end Network for Efficient Person Search
    Li, Zhengjia
    Miao, Duoqian
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 2011 - 2019
  • [6] Segmentation mask guided end-to-end person search
    Zheng, Dingyuan
    Xiao, Jimin
    Huang, Kaizhu
    Zhao, Yao
    SIGNAL PROCESSING-IMAGE COMMUNICATION, 2020, 86
  • [7] Query-guided End-to-End Person Search
    Munjal, Bharti
    Amin, Sikandar
    Tombari, Federico
    Galasso, Fabio
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 811 - 820
  • [8] Diverse Knowledge Distillation for End-to-End Person Search
    Zhang, Xinyu
    Wang, Xinlong
    Bian, Jia-Wang
    Shen, Chunhua
    You, Mingyu
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 3412 - 3420
  • [9] End-to-End Person Search Sequentially Trained on Aggregated Dataset
    Loesch, Angelique
    Rabarisoa, Jaonary
    Audigier, Romaric
    2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2019, : 4574 - 4578
  • [10] End-to-End Learning of Motion Representation for Video Understanding
    Fan, Lijie
    Huang, Wenbing
    Gan, Chuang
    Ermon, Stefano
    Gong, Boqing
    Huang, Junzhou
    2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 6016 - 6025