What-Where-When Attention Network for video-based person re-identification

Cited by: 6
Authors
Zhang, Chenrui [1 ,2 ]
Chen, Ping [1 ]
Lei, Tao [3 ]
Wu, Yangxu [1 ]
Meng, Hongying [4 ]
Affiliations
[1] North Univ China, State Key Lab Elect Testing Technol, Taiyuan 030051, Peoples R China
[2] Luliang Univ, Dept Phys, Luliang 033000, Peoples R China
[3] Shaanxi Univ Sci & Technol, Sch Elect Informat & Artificial Intelligence, Xian 710021, Peoples R China
[4] Brunel Univ London, Dept Elect & Elect Engn, Uxbridge, Middx, England
Funding
National Natural Science Foundation of China;
Keywords
Person re-identification; What-Where-When Attention; Spatial-temporal feature; Graph attention network; Attribute; Identity;
DOI
10.1016/j.neucom.2021.10.018
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Video-based person re-identification plays a critical role in intelligent video surveillance by learning temporal correlations from consecutive video frames. Most existing methods use attention mechanisms to handle challenging variations such as pose, occlusion, and background clutter. However, they focus almost exclusively on occlusion and learn occlusion-invariant video representations by discarding occluded regions or frames, even though the remaining areas of those frames still contain useful spatial information and temporal cues. To overcome these drawbacks, this paper proposes a comprehensive attention mechanism covering what, where, and when to pay attention during discriminative spatial-temporal feature learning, namely the What-Where-When Attention Network (W3AN). Concretely, W3AN designs a spatial attention module that focuses on pedestrian identity and salient attributes through an importance-estimating layer (What and Where), and a temporal attention module that computes frame-level importance (When), which is embedded into a graph attention network to exploit temporal attention features rather than computing a weighted average feature over video frames as existing methods do. Experiments on three widely recognized datasets demonstrate the effectiveness of the proposed W3AN model, and the discussion of its major modules elaborates the contributions of this paper. (c) 2021 Elsevier B.V. All rights reserved.
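To make the mechanism sketched in the abstract more concrete (frame-level importance scores embedded into a graph attention layer over video frames, instead of a plain weighted average), the following minimal Python/PyTorch sketch shows one way such temporal graph-attention pooling could be written. It is not the authors' W3AN implementation; the module name TemporalGraphAttention, the feature dimensions, and the sigmoid frame-scoring head are illustrative assumptions.

# Illustrative sketch only, not the authors' released code: each frame of a
# clip is a graph node, a learned frame-importance score ("When") reweights
# the nodes, and GAT-style pairwise attention aggregates them.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TemporalGraphAttention(nn.Module):
    """Aggregates per-frame features with graph attention over frames."""

    def __init__(self, feat_dim: int = 2048, hidden_dim: int = 256):
        super().__init__()
        self.frame_score = nn.Linear(feat_dim, 1)      # frame-level importance ("When")
        self.proj = nn.Linear(feat_dim, hidden_dim)    # node feature projection
        self.attn = nn.Linear(2 * hidden_dim, 1)       # GAT-style pairwise attention

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (B, T, D) per-frame features from a CNN backbone
        B, T, D = frames.shape
        w = torch.sigmoid(self.frame_score(frames))    # (B, T, 1) frame importance
        h = self.proj(frames) * w                      # reweight nodes by importance
        # Form all frame pairs and compute attention coefficients
        hi = h.unsqueeze(2).expand(B, T, T, -1)        # (B, T, T, H)
        hj = h.unsqueeze(1).expand(B, T, T, -1)        # (B, T, T, H)
        e = F.leaky_relu(self.attn(torch.cat([hi, hj], dim=-1)).squeeze(-1))  # (B, T, T)
        alpha = F.softmax(e, dim=-1)                   # normalise over neighbour frames
        out = torch.bmm(alpha, h)                      # (B, T, H) attended node features
        return out.mean(dim=1)                         # clip-level representation

# Usage: pool a batch of 4 clips, each with 8 frames of 2048-d features.
# clip = torch.randn(4, 8, 2048)
# emb = TemporalGraphAttention()(clip)   # -> (4, 256)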
Pages: 33 - 47
Number of pages: 15