Video-Based Person Re-Identification by an End-To-End Learning Architecture with Hybrid Deep Appearance-Temporal Feature

Cited by: 5
Authors
Sun, Rui [1 ]
Huang, Qiheng [1 ]
Xia, Miaomiao [1 ]
Zhang, Jun [1 ]
Affiliations
[1] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Feicui Rd 420, Hefei 230000, Anhui, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
person re-identification; end-to-end architecture; appearance-temporal features; Siamese network; pivotal frames;
DOI
10.3390/s18113669
CLC Classification Number
O65 [Analytical Chemistry];
Subject Classification Codes
070302 ; 081704 ;
Abstract
Video-based person re-identification is an important task that faces the challenges of lighting variation, low-resolution images, background clutter, occlusion, and similarity in human appearance across multi-camera visual sensor networks. In this paper, we propose a video-based person re-identification method: an end-to-end learning architecture with a hybrid deep appearance-temporal feature. It learns the appearance features of pivotal frames, the temporal features, and an independent distance metric for each type of feature. The architecture consists of a two-stream deep feature structure and two Siamese networks. In the first stream, we propose the Two-branch Appearance Feature (TAF) sub-structure to obtain persons' appearance information, and use one of the two Siamese networks to learn the similarity between the appearance features of a pairwise person. To exploit temporal information, we design the second stream, consisting of the Optical flow Temporal Feature (OTF) sub-structure and the other Siamese network, to learn persons' temporal features and the distances between pairwise features. In addition, we select the pivotal frames of each video as inputs to the Inception-V3 network in the Two-branch Appearance Feature sub-structure, and employ a salience-learning fusion layer to fuse the learned global and local appearance features. Extensive experiments on the PRID2011, iLIDS-VID, and Motion Analysis and Re-identification Set (MARS) datasets show that the proposed architecture reaches Rank-1 accuracies of 79%, 59%, and 72%, respectively, outperforming state-of-the-art algorithms. It also improves the feature representation ability for persons.
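As a rough illustration of the idea (not the authors' implementation): once each stream's Siamese network has produced an embedding, the two per-stream distances can be fused into a single matching score with independent weights per feature type. The sketch below assumes plain Euclidean distances, equal stream weights, and toy 4-D feature vectors; all names (`hybrid_distance`, `w_app`, `w_tmp`) are hypothetical.

```python
import math
import random

def l2_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hybrid_distance(app_a, app_b, tmp_a, tmp_b, w_app=0.5, w_tmp=0.5):
    """Weighted sum of the appearance-stream and temporal-stream
    Siamese distances; a smaller score means a more likely match."""
    return w_app * l2_distance(app_a, app_b) + w_tmp * l2_distance(tmp_a, tmp_b)

# Toy gallery: sequence B is a near-duplicate of A (same person),
# sequence C is drawn independently (different person).
random.seed(0)
app_a = [random.gauss(0, 1) for _ in range(4)]
tmp_a = [random.gauss(0, 1) for _ in range(4)]
app_b = [x + 0.1 for x in app_a]
tmp_b = [x + 0.1 for x in tmp_a]
app_c = [random.gauss(0, 1) for _ in range(4)]
tmp_c = [random.gauss(0, 1) for _ in range(4)]

d_same = hybrid_distance(app_a, app_b, tmp_a, tmp_b)
d_diff = hybrid_distance(app_a, app_c, tmp_a, tmp_c)
```

In the paper, the per-stream metrics are learned by the two Siamese networks rather than fixed to Euclidean distance; this sketch only shows how the two streams' outputs combine into one ranking score.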
Pages: 21
Related Papers
50 items in total
  • [21] An end-to-end exemplar association for unsupervised person Re-identification
    Wu, Jinlin
    Yang, Yang
    Lei, Zhen
    Wang, Jinqiao
    Li, Stan Z.
    Tiwari, Prayag
    Pandey, Hari Mohan
    NEURAL NETWORKS, 2020, 129 : 43 - 54
  • [22] End-to-end training of CNN ensembles for person re-identification
    Serbetci, Ayse
    Akgul, Yusuf Sinan
    PATTERN RECOGNITION, 2020, 104
  • [23] Person Re-identification with End-to-End Scene Text Recognition
    Kamlesh
    Xu, Pei
    Yang, Yang
    Xu, Yongchao
    COMPUTER VISION, PT III, 2017, 773 : 363 - 374
  • [24] End-to-End Comparative Attention Networks for Person Re-Identification
    Liu, Hao
    Feng, Jiashi
    Qi, Meibin
    Jiang, Jianguo
    Yan, Shuicheng
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2017, 26 (07) : 3492 - 3506
  • [25] Person re-identification on lightweight devices: end-to-end approach
    Dang T.L.
    Pham T.H.
    Le D.L.
    Tran X.T.
    Le H.N.
    Nguyen K.H.
    Trinh T.T.N.
    Multimedia Tools and Applications, 2024, 83 (29) : 73569 - 73582
  • [26] Dense Interaction Learning for Video-based Person Re-identification
    He, Tianyu
    Jin, Xin
    Shen, Xu
    Huang, Jianqiang
    Chen, Zhibo
    Hua, Xian-Sheng
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 1470 - 1481
  • [27] Video-based Person Re-identification Based on Feature Learning of Valid Regions and Distance Fusion
    Yang, Danmei
    Qi, Meibin
    Wu, Jingjing
    Jiang, Jianguo
    2019 3RD INTERNATIONAL CONFERENCE ON MACHINE VISION AND INFORMATION TECHNOLOGY (CMVIT 2019), 2019, 1229
  • [28] Joint Attentive Spatial-Temporal Feature Aggregation for Video-Based Person Re-Identification
    Chen, Lin
    Yang, Hua
    Gao, Zhiyong
    IEEE ACCESS, 2019, 7 : 41230 - 41240
  • [29] Few-Shot Deep Adversarial Learning for Video-Based Person Re-Identification
    Wu, Lin
    Wang, Yang
    Yin, Hongzhi
    Wang, Meng
    Shao, Ling
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 1233 - 1245
  • [30] CONVOLUTIONAL TEMPORAL ATTENTION MODEL FOR VIDEO-BASED PERSON RE-IDENTIFICATION
    Rahman, Tanzila
    Rochan, Mrigank
    Wang, Yang
    2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2019, : 1102 - 1107