Joint discriminative representation learning for end-to-end person search

Cited by: 17
Authors
Zhang, Pengcheng [1]
Yu, Xiaohan [2, 3]
Bai, Xiao [1]
Wang, Chen [1]
Zheng, Jin [1]
Ning, Xin [4]
Affiliations
[1] Beihang Univ, Jiangxi Res Inst, Sch Comp Sci & Engn, State Key Lab Software Dev Environm, Beijing, Peoples R China
[2] Macquarie Univ, Sch Comp, Sydney, Australia
[3] Griffith Univ, Inst Integrated & Intelligent Syst, Brisbane, Australia
[4] Chinese Acad Sci, Inst Semicond, Beijing, Peoples R China
Funding
US National Science Foundation;
Keywords
Person search; Person re-identification; Part segmentation; Batch sampling; NETWORK;
DOI
10.1016/j.patcog.2023.110053
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Person search simultaneously detects and retrieves a query person from uncropped scene images. Existing methods are either two-step or end-to-end. The former employs two standalone models for the two sub-tasks, while the latter conducts person search with a unified model. Despite encouraging progress, most existing end-to-end methods focus on balancing the model between the detection and retrieval sub-tasks while neglecting to enhance the learned representation for retrieval, which leaves their accuracy below that of two-step approaches. To address this, we propose a novel hierarchical framework that jointly optimizes instance-aware and part-aware embeddings to enable discriminative representation learning. Specifically, we develop a region-of-interest cosegment (ROICoseg) module that captures part-aware information without requiring extra annotations, enabling fine-grained discriminative representation. On top of that, a Contextual Instance Batch Sampling (CIBS) method is introduced to exploit contextual information when constructing training batches, thus facilitating effective instance-aware representation learning. We further introduce the first cross-door person search dataset (CDPS), in which a target person is retrieved from outdoor cameras using an indoor-captured image, or vice versa. Extensive experiments show that our proposed model achieves competitive performance on CUHK-SYSU and outperforms state-of-the-art end-to-end methods on the more challenging PRW and CDPS.
Pages: 11
Related papers
50 in total
  • [31] Spatial and temporal learning representation for end-to-end recording device identification
    Zeng, Chunyan
    Zhu, Dongliang
    Wang, Zhifeng
    Wu, Minghu
    Xiong, Wei
    Zhao, Nan
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2021, 2021 (01)
  • [32] END-TO-END BINARY REPRESENTATION LEARNING VIA DIRECT BINARY EMBEDDING
    Liu, Liu
    Rahimpour, Alireza
    Taalimi, Ali
    Qi, Hairong
    2017 24TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2017, : 1257 - 1261
  • [33] End-to-end Reinforcement Learning of Robotic Manipulation with Robust Keypoints Representation
    Wang, Tianying
    Puang, En Yen
    Lee, Marcus
    Jing, Wei
    Wu, Yan
    PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 590 - 597
  • [34] An End-to-End Multiplex Graph Neural Network for Graph Representation Learning
    Liang, Yanyan
    Zhang, Yanfeng
    Gao, Dechao
    Xu, Qian
    IEEE ACCESS, 2021, 9 : 58861 - 58869
  • [35] End-to-End Efficient Representation Learning via Cascading Combinatorial Optimization
    Jeong, Yeonwoo
    Kim, Yoonsung
    Song, Hyun Oh
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 11371 - 11379
  • [36] End-to-end Video-level Representation Learning for Action Recognition
    Zhu, Jiagang
    Zhu, Zheng
    Zou, Wei
    2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018, : 645 - 650
  • [37] End-to-end Autonomous Driving Perception with Sequential Latent Representation Learning
    Chen, Jianyu
    Xu, Zhuo
    Tomizuka, Masayoshi
    2020 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2020, : 1999 - 2006
  • [38] An End-to-End Joint Unsupervised Learning of Deep Model and Pseudo-Classes for Remote Sensing Scene Representation
    Gong, Zhiqiang
    Zhong, Ping
    Hu, Weidong
    Liu, Fang
    Hui, Bingwei
    2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019,
  • [39] Influencer Loss: End-to-end Geometric Representation Learning for Track Reconstruction
    Murnane, Daniel
    26TH INTERNATIONAL CONFERENCE ON COMPUTING IN HIGH ENERGY AND NUCLEAR PHYSICS, CHEP 2023, 2024, 295
  • [40] Improved Model Structure with Cosine Margin OIM Loss for End-to-End Person Search
    Chen, Haoran
    Zhu, Minghua
    Cai, Xuesong
    Luo, Jufeng
    Qiu, Yunzhou
    MULTIMEDIA MODELING (MMM 2020), PT I, 2020, 11961 : 419 - 430