Dual-Memory Feature Aggregation for Video Object Detection

Cited by: 0

Authors:

Fan, Diwei [1,2,3]

Zheng, Huicheng [1,2,3]

Dang, Jisheng [1,2,3]

Affiliations:

[1] Sun Yat Sen Univ, Sch Comp Sci & Engn, Guangzhou, Peoples R China

[2] Minist Educ, Key Lab Machine Intelligence & Adv Comp, Guangzhou, Peoples R China

[3] Guangdong Prov Key Lab Informat Secur Technol, Guangzhou, Peoples R China

Source:

PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT VI | 2024 / Vol. 14430

Funding:

National Natural Science Foundation of China;

Keywords:

video object detection; feature aggregation; temporal information; global memory; local feature cache;

DOI:

10.1007/978-981-99-8537-1_18

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104; 0812; 0835; 1405;

Abstract:

Recent studies on video object detection have shown the advantages of aggregating features across frames to capture temporal information, which can mitigate appearance degradation, such as occlusion, motion blur, and defocus. However, these methods often employ a sliding window or memory queue to store temporal information frame by frame, leading to discarding features of earlier frames over time. To address this, we propose a dual-memory feature aggregation framework (DMFA). DMFA simultaneously constructs a local feature cache and a global feature memory in a feature-wise updating way at different granularities, i.e., pixel level and proposal level. This approach can partially preserve key features across frames. The local feature cache stores the spatio-temporal contexts from nearby frames to boost the localization capacity, while the global feature memory enhances semantic feature representation by capturing temporal information from all previous frames. Moreover, we introduce contrastive learning to improve the discriminability of temporal features, resulting in more accurate proposal-level feature aggregation. Extensive experiments demonstrate that our method achieves state-of-the-art performance on the ImageNet VID benchmark.


Pages: 220-232

Number of pages: 13

