What-Where-When Attention Network for video-based person re-identification

Cited by: 6
Authors
Zhang, Chenrui [1, 2]
Chen, Ping [1 ]
Lei, Tao [3 ]
Wu, Yangxu [1 ]
Meng, Hongying [4 ]
Affiliations
[1] North Univ China, State Key Lab Elect Testing Technol, Taiyuan 030051, Peoples R China
[2] Luliang Univ, Dept Phys, Luliang 033000, Peoples R China
[3] Shaanxi Univ Sci & Technol, Sch Elect Informat & Artificial Intelligence, Xian 710021, Peoples R China
[4] Brunel Univ London, Dept Elect & Elect Engn, Uxbridge, Middx, England
Funding
National Natural Science Foundation of China;
Keywords
Person re-identification; What-Where-When Attention; Spatial-temporal feature; Graph attention network; Attribute; Identity;
DOI
10.1016/j.neucom.2021.10.018
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Video-based person re-identification plays a critical role in intelligent video surveillance by learning temporal correlations from consecutive video frames. Most existing methods use attention mechanisms to address challenging variations in pose, occlusion, background, and so on. Almost all of them focus on occlusion and learn occlusion-invariant video representations by discarding occluded areas or frames, even though the remaining areas in those frames contain sufficient spatial information and temporal cues. To overcome these drawbacks, this paper proposes a comprehensive attention mechanism covering what, where, and when to pay attention in discriminative spatial-temporal feature learning, namely the What-Where-When Attention Network (W3AN). Concretely, W3AN designs a spatial attention module that focuses on pedestrian identity and obvious attributes through an importance estimating layer (What and Where), and a temporal attention module that calculates frame-level importance (When). The temporal module is embedded into a graph attention network to exploit temporal attention features, rather than computing a weighted average feature over video frames as existing methods do. Experiments on three widely recognized datasets demonstrate the effectiveness of the proposed W3AN model, and a discussion of the major modules elaborates the contributions of this paper. (c) 2021 Elsevier B.V. All rights reserved.
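The contrast the abstract draws between weighted-average temporal pooling and graph-attention aggregation can be illustrated with a small NumPy sketch. This is not the authors' W3AN implementation: the function names, the single-head GAT-style layer, the fully connected frame graph, and the way frame-level scores are added to the edge logits are all assumptions made for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def weighted_average_pool(frames, scores):
    """Baseline: one softmax weight per frame, then a weighted mean.

    frames: (T, D) frame-level features; scores: (T,) frame importance logits.
    """
    weights = softmax(scores)          # (T,)
    return weights @ frames            # (D,)

def graph_attention_pool(frames, scores, W, a):
    """Hypothetical single-head GAT-style layer over a fully connected frame graph.

    W: (D, D_out) projection; a: (2 * D_out,) attention vector.
    Frame importance is injected as a bias on the edge logits (an assumption,
    not something the abstract specifies).
    """
    h = frames @ W                                 # (T, D_out)
    d_out = h.shape[1]
    # e[i, j] = a^T [h_i || h_j], split into source and destination terms.
    e = (h @ a[:d_out])[:, None] + (h @ a[d_out:])[None, :]
    e = np.where(e > 0, e, 0.2 * e)                # LeakyReLU, slope 0.2
    e = e + scores[None, :]                        # favour important frames
    alpha = softmax(e, axis=1)                     # (T, T), rows sum to 1
    h_new = alpha @ h                              # each frame aggregates the others
    return h_new.mean(axis=0), alpha               # (D_out,), (T, T)
```

In the baseline every frame contributes through one scalar weight, whereas in the graph variant each frame first gathers information from all other frames before the sequence is pooled, which is the general idea the abstract points to.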
Pages: 33-47
Page count: 15