Spatial-Temporal Graph Convolutional Network for Video-based Person Re-identification

Cited by: 153
Authors
Yang, Jinrui [1 ,3 ]
Zheng, Wei-Shi [1 ,2 ,3 ]
Yang, Qize [1 ,3 ]
Chen, Ying-Cong [4 ]
Tian, Qi [5 ]
Affiliations
[1] Sun Yat Sen Univ, Sch Data & Comp Sci, Guangzhou, Peoples R China
[2] Peng Cheng Lab, Shenzhen 518005, Peoples R China
[3] Minist Educ, Key Lab Machine Intelligence & Adv Comp, Beijing, Peoples R China
[4] Chinese Univ Hong Kong, Hong Kong, Peoples R China
[5] Huawei Noahs Ark Lab, Hong Kong, Peoples R China
DOI
10.1109/CVPR42600.2020.00335
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
While video-based person re-identification (Re-ID) has drawn increasing attention and made great progress in recent years, it is still very challenging to effectively overcome the occlusion problem and the visual ambiguity problem for visually similar negative samples. On the other hand, we observe that different frames of a video can provide complementary information for each other, and the structural information of pedestrians can provide extra discriminative cues for appearance features. Thus, modeling the temporal relations of different frames and the spatial relations within a frame has the potential for solving the above problems. In this work, we propose a novel Spatial-Temporal Graph Convolutional Network (STGCN) to solve these problems. The STGCN includes two GCN branches, a spatial one and a temporal one. The spatial branch extracts structural information of a human body. The temporal branch mines discriminative cues from adjacent frames. By jointly optimizing these branches, our model extracts robust spatial-temporal information that is complementary to appearance information. As shown in the experiments, our model achieves state-of-the-art results on the MARS and DukeMTMC-VideoReID datasets.
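The two-branch idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify how the graphs are built, so the similarity-based adjacency, the single propagation step ReLU(A X W), the mean pooling, and the fusion by concatenation are all assumptions made for illustration, and every function name below is hypothetical.

```python
import math

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def similarity_adjacency(feats):
    """Assumed graph construction: softmax over pairwise dot products,
    giving a row-normalised adjacency matrix (each row sums to 1)."""
    adj = []
    for f in feats:
        scores = [sum(x * y for x, y in zip(f, g)) for g in feats]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        adj.append([e / total for e in exps])
    return adj

def gcn_layer(feats, adj, weight):
    """One generic graph-convolution step: ReLU(A @ X @ W)."""
    out = matmul(matmul(adj, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]

def mean_pool(nodes):
    """Average a list of node feature vectors into one vector."""
    return [sum(col) / len(nodes) for col in zip(*nodes)]

# Toy video: 3 frames, 2 patch features per frame, feature dimension 2.
video = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.9, 0.1], [0.1, 0.9]],
    [[0.8, 0.2], [0.2, 0.8]],
]
W = [[1.0, 0.0], [0.0, 1.0]]  # identity weights stand in for learned parameters

# Spatial branch (assumption): one graph per frame, nodes are that frame's patches.
spatial_frames = [gcn_layer(f, similarity_adjacency(f), W) for f in video]
spatial_feat = mean_pool([mean_pool(f) for f in spatial_frames])

# Temporal branch (assumption): one graph whose nodes are the patches of all frames.
all_patches = [p for frame in video for p in frame]
temporal_nodes = gcn_layer(all_patches, similarity_adjacency(all_patches), W)
temporal_feat = mean_pool(temporal_nodes)

# Fusion by concatenation (assumption): a 4-dimensional sequence descriptor.
fused = spatial_feat + temporal_feat
print(fused)
```

In a trained model the weights would be learned jointly for both branches and the descriptor would be combined with appearance features; the sketch only shows how within-frame and cross-frame graphs can produce separate features from the same patches.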
Pages: 3286-3296
Number of pages: 11