Spatial-temporal graph-guided global attention network for video-based person re-identification

Cited by: 0
Authors
Xiaobao Li
Wen Wang
Qingyong Li
Jiang Zhang
Affiliations
[1] Jiangsu Normal University,School of Computer Science and Technology
[2] Beijing Jiaotong University,Beijing Key Lab of Traffic Data Analysis and Mining
[3] China Academy of Aerospace Aerodynamics
Keywords
Person re-identification; Global attention learning; Graph; Spatial-temporal
DOI: not available
Abstract
Global attention learning has been widely applied to video-based person re-identification owing to its strength in capturing contextual correlations. However, existing global attention learning methods usually adopt conventional neural networks to model non-Euclidean contextual correlations, which limits their representation ability. Inspired by the graph-structured nature of these correlations, we propose a spatial-temporal graph-guided global attention network (STG³A) for video-based person re-identification. STG³A comprises two graph-guided attention modules that capture the spatial contexts within a frame and the temporal contexts across all frames of a sequence for global attention learning. Furthermore, the graphs from both modules are encoded as graph representations, which are combined with weighted representations to fully exploit spatial-temporal contextual information for video feature learning. To reduce the effect of noisy graph nodes and learn robust graph representations, a graph node attention is developed to trade off the importance of each graph node, yielding noise-tolerant graph models. Finally, we design a graph-guided fusion scheme that integrates the representations produced by the two attention modules into a more compact video feature. Extensive experiments on the MARS and DukeMTMC-VideoReID datasets demonstrate the superior performance of STG³A.
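The abstract describes a graph node attention that weighs the importance of each graph node before pooling a graph representation. The paper's actual formulation is not reproduced here; the following is only a minimal NumPy sketch of that general idea, with all names, shapes, and the scoring function chosen hypothetically for illustration:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def graph_node_attention(node_feats, score_vec):
    """Pool node features into one graph representation.

    node_feats: (N, D) array, one D-dim feature per graph node
                (e.g., spatial regions of a frame, or frames of a sequence).
    score_vec:  (D,) hypothetical learned scoring vector; in the paper the
                scores would come from a trained attention sub-network.
    Returns the attention-weighted graph representation and the node weights.
    """
    scores = node_feats @ score_vec        # (N,) raw importance per node
    alpha = softmax(scores)                # normalized node weights, sum to 1
    graph_repr = alpha @ node_feats        # (D,) weighted sum of node features
    return graph_repr, alpha

# Toy usage: 6 nodes with 8-dim features.
rng = np.random.default_rng(0)
nodes = rng.normal(size=(6, 8))
score_vec = rng.normal(size=8)
g, alpha = graph_node_attention(nodes, score_vec)
```

Because the weights are softmax-normalized, a noisy node can only be down-weighted, not remove information from other nodes, which matches the noise-tolerance motivation stated in the abstract.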
Related papers (50 in total)
  • [31] Context Sensing Attention Network for Video-based Person Re-identification
    Wang, Kan
    Ding, Changxing
    Pang, Jianxin
    Xu, Xiangmin
    [J]. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2023, 19 (04)
  • [32] Video-Based Convolutional Attention for Person Re-Identification
    Zamprogno, Marco
    Passon, Marco
    Martinel, Niki
    Serra, Giuseppe
    Lancioni, Giuseppe
    Micheloni, Christian
    Tasso, Carlo
    Foresti, Gian Luca
    [J]. IMAGE ANALYSIS AND PROCESSING - ICIAP 2019, PT I, 2019, 11751 : 3 - 14
  • [33] Spatial Quality Aware Network for Video-Based Person Re-identification
    Wang, Yujie
    Leng, Biao
    Song, Guanglu
    [J]. NEURAL INFORMATION PROCESSING (ICONIP 2017), PT III, 2017, 10636 : 34 - 43
  • [34] Spatial-Temporal Person Re-Identification
    Wang, Guangcong
    Lai, Jianhuang
    Huang, Peigen
    Xie, Xiaohua
    [J]. THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 8933 - 8940
  • [35] Parallel Attention with Weighted Efficient Network for Video-Based Person Re-Identification
    Yang, Junting
    Yang, Zuliu
    Zhou, Jing
    Zhao, Yong
    Dai, Qifei
    Li, Fuchi
    [J]. 2021 5TH INTERNATIONAL CONFERENCE ON INNOVATION IN ARTIFICIAL INTELLIGENCE (ICIAI 2021), 2021, : 133 - 139
  • [36] An Efficient Axial-Attention Network for Video-Based Person Re-Identification
    Zhang, Fuping
    Zhang, Tianzhao
    Sun, Ruoxi
    Huang, Chao
    Wei, Jianming
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 1352 - 1356
  • [37] Multi-stage attention network for video-based person re-identification
    Yang, Fan
    Li, Wei
    Liang, Binbin
    Han, Songchen
    Zhu, Xuan
    [J]. IET COMPUTER VISION, 2022, 16 (05) : 445 - 455
  • [38] Intermediary-Guided Bidirectional Spatial–Temporal Aggregation Network for Video-Based Visible-Infrared Person Re-Identification
    Li, Huafeng
    Liu, Minghui
    Hu, Zhanxuan
    Nie, Feiping
    Yu, Zhengtao
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (09) : 4962 - 4972
  • [39] Temporal Aggregation with Clip-level Attention for Video-based Person Re-identification
    Li, Mengliu
    Xu, Han
    Wang, Jinjun
    Li, Wenpeng
    Sun, Yongli
    [J]. 2020 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2020, : 3365 - 3373
  • [40] Deeply Coupled Convolution-Transformer With Spatial-Temporal Complementary Learning for Video-Based Person Re-Identification
    Liu, Xuehu
    Yu, Chenyang
    Zhang, Pingping
    Lu, Huchuan
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, : 1 - 11