Viewing From Frequency Domain: A DCT-based Information Enhancement Network for Video Person Re-Identification

Cited by: 9

Authors:

Liu, Liangchen [1]

Yang, Xi [1]

Wang, Nannan [1]

Gao, Xinbo [2]

Affiliations:

[1] Xidian University, State Key Laboratory of Integrated Services Networks, Xi'an, People's Republic of China

[2] Chongqing University of Posts and Telecommunications, Key Laboratory of Image Cognition, Chongqing, People's Republic of China

Source:

PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021 | 2021

Funding:

National Natural Science Foundation of China;

Keywords:

video-based person re-identification; discrete cosine transform; spatio-temporal feature learning;

DOI:

10.1145/3474085.3475566

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Video-based person re-identification (Re-ID) aims to match target pedestrians across a non-overlapping camera system using video tracklets. The key issue in video Re-ID is exploring effective spatio-temporal features. Generally, the spatio-temporal information of a video sequence can be divided into two aspects: the discriminative information in each frame and the shared information over the whole sequence. To make full use of the rich information in video sequences, this paper proposes a Discrete Cosine Transform based Information Enhancement Network (DCT-IEN) to achieve a more comprehensive spatio-temporal representation from the frequency domain. Inspired by the principle that average pooling is a special case of the DCT frequency components (the lowest frequency component), DCT-IEN first adopts the discrete cosine transform to convert the extracted feature maps into the frequency domain, thereby retaining more of the information embedded in different frequency components. With the help of the DCT frequency spectrum, two branches are adopted to learn the final video representation: the Frequency Selection Module (FSM) and the Lowest Frequency Enhancement Module (LFEM). FSM explores the most discriminative features in each frame by aggregating different frequency components with an attention mechanism. LFEM enhances the shared feature over the whole video sequence by frame feature regularization. By fusing these two kinds of features, DCT-IEN finally achieves a comprehensive video representation. We conduct extensive experiments on two widely used datasets. The experimental results verify our idea and demonstrate the effectiveness of DCT-IEN for video-based Re-ID.
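
The abstract's observation that average pooling corresponds to the lowest DCT frequency component can be checked numerically. The following is a minimal NumPy sketch, not the authors' implementation: the function names `dct2_basis` and `dct2_component`, the tensor shape (frames, channels, height, width), and the unnormalized DCT-II convention are all assumptions made for illustration.

```python
import numpy as np


def dct2_basis(u, v, height, width):
    """Unnormalized 2D DCT-II basis function for frequency index (u, v).

    Hypothetical helper for illustration; not taken from the paper.
    """
    rows = np.cos(np.pi * u * (np.arange(height) + 0.5) / height)
    cols = np.cos(np.pi * v * (np.arange(width) + 0.5) / width)
    return np.outer(rows, cols)  # shape: (height, width)


def dct2_component(feature_map, u, v):
    """Project the last two (spatial) axes onto one DCT basis function."""
    height, width = feature_map.shape[-2:]
    basis = dct2_basis(u, v, height, width)
    # Broadcasting applies the same basis to every frame and channel.
    return (feature_map * basis).sum(axis=(-2, -1))


# Assumed toy input: 4 frames, 8 channels, 16 x 8 spatial feature maps.
rng = np.random.default_rng(0)
features = rng.standard_normal((4, 8, 16, 8))

# The (0, 0) basis is all ones, so the component is the spatial sum.
lowest = dct2_component(features, 0, 0)

# Global average pooling over the spatial axes.
gap = features.mean(axis=(-2, -1))

# The two agree up to the constant factor height * width.
assert np.allclose(lowest, gap * 16 * 8)

# Any other index, e.g. (1, 0), captures information that pooling discards.
higher = dct2_component(features, 1, 0)
```

Under this convention the (0, 0) basis function is constant, so the lowest component differs from global average pooling only by the factor height × width. Components at other indices, such as `higher` above, carry spatial variation that average pooling alone would not retain.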

Pages: 227-235

Number of pages: 9
