Global Memory and Local Continuity for Video Object Detection

Cited by: 13

Authors:

Han, Liang [1]

Yin, Zhaozheng [1,2]

Affiliations:

[1] SUNY Stony Brook, Dept Comp Sci, Stony Brook, NY 11794 USA

[2] SUNY Stony Brook, Dept Biomed Informat, Stony Brook, NY 11794 USA

Source:

IEEE Transactions on Multimedia | 2023 / Vol. 25

Funding:

National Science Foundation (USA);

Keywords:

Feature extraction; Object detection; Detectors; Proposals; Target tracking; Signal processing algorithms; Costs; Video object detection; global memory bank; feature aggregation; local continuity; object tracker;

DOI:

10.1109/TMM.2022.3164253

Chinese Library Classification (CLC) number:

TP [Automation and Computer Technology];

Discipline classification code:

0812;

Abstract:

To deal with challenges in video object detection (VOD), such as occlusion and motion blur, many state-of-the-art video object detectors adopt a feature aggregation module that encodes long-range contextual information to support the current frame. These detectors have three main drawbacks: first, frame-wise detection slows down detection; second, frame-wise detection usually ignores the local continuity of objects in a video, resulting in temporally inconsistent detections; third, the feature aggregation module usually encodes temporal features from either a local video clip or a single video, without exploiting features in other videos. In this work, we develop an online VOD algorithm that aims to balance high speed and high accuracy by exploiting global memory and local continuity. In the algorithm, an effective and efficient global memory bank (GMB) is designed to deposit and update object class features, which enables us to exploit support features from other videos to enhance object features in the current video frames. In addition, to further speed up detection, we design an object tracker that performs object detection for non-key frames based on the detection results of the key frame, leveraging the local continuity of the video. Considering the trade-off between detection accuracy and speed, the proposed framework achieves superior performance on the ImageNet VID dataset. Source code will be released to the public via our GitHub website.
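
The two ideas named in the abstract, a class-wise memory bank that supplies support features and a key-frame schedule that hands non-key frames to a tracker, can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not state the update rule, the similarity measure, or the key-frame interval, so the FIFO capacity, the dot-product softmax weighting, the averaging step, and the names `GlobalMemoryBank`, `enhance`, and `schedule` are all assumptions made for illustration.

```python
import math
from collections import defaultdict, deque


class GlobalMemoryBank:
    """Toy per-class feature store. Capacity and FIFO update are assumptions."""

    def __init__(self, capacity_per_class=4):
        # One bounded queue of feature vectors per object class.
        self.bank = defaultdict(lambda: deque(maxlen=capacity_per_class))

    def deposit(self, class_id, feature):
        # Once a class queue is full, its oldest feature is dropped.
        self.bank[class_id].append(list(feature))

    def enhance(self, class_id, query):
        """Average the query with a softmax-weighted sum of stored features."""
        support = self.bank.get(class_id)
        if not support:
            # No support features for this class yet: return the query as-is.
            return list(query)
        # Dot-product similarity between the query and each stored feature.
        scores = [sum(q * s for q, s in zip(query, feat)) for feat in support]
        peak = max(scores)  # subtracted for numerical stability
        weights = [math.exp(score - peak) for score in scores]
        total = sum(weights)
        aggregated = [
            sum(w * feat[i] for w, feat in zip(weights, support)) / total
            for i in range(len(query))
        ]
        return [(q + a) / 2 for q, a in zip(query, aggregated)]


def schedule(num_frames, key_interval=5):
    """Label frames: 'detect' on key frames, 'track' on non-key frames."""
    return [
        "detect" if i % key_interval == 0 else "track"
        for i in range(num_frames)
    ]


bank = GlobalMemoryBank(capacity_per_class=2)
bank.deposit("dog", [1.0, 0.0])
bank.deposit("dog", [1.0, 0.0])
enhanced = bank.enhance("dog", [0.0, 1.0])
plan = schedule(6, key_interval=3)
```

In this sketch the bank is shared across videos simply because it is keyed by class rather than by video, which mirrors the abstract's point about using support features from other videos; a real detector would operate on learned proposal features rather than hand-written vectors.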

Pages: 3681-3693

Page count: 13
