Global Memory and Local Continuity for Video Object Detection

Cited by: 13

Authors:

Han, Liang [1]

Yin, Zhaozheng [1,2]

Affiliations:

[1] SUNY Stony Brook, Dept Comp Sci, Stony Brook, NY 11794 USA

[2] SUNY Stony Brook, Dept Biomed Informat, Stony Brook, NY 11794 USA

Source:

IEEE TRANSACTIONS ON MULTIMEDIA | 2023 / Vol. 25

Funding:

U.S. National Science Foundation;

Keywords:

Feature extraction; Object detection; Detectors; Proposals; Target tracking; Signal processing algorithms; Costs; Video object detection; global memory bank; feature aggregation; local continuity; object tracker;

DOI:

10.1109/TMM.2022.3164253

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

To address challenges in video object detection (VOD) such as occlusion and motion blur, many state-of-the-art video object detectors adopt a feature aggregation module that encodes long-range contextual information to support the current frame. These detectors have three main drawbacks: first, frame-wise detection limits detection speed; second, frame-wise detection usually ignores the local continuity of objects in a video, resulting in temporally inconsistent detections; third, the feature aggregation module usually encodes temporal features from either a local video clip or a single video, without exploiting features from other videos. In this work, we develop an online VOD algorithm that aims to balance high speed and high accuracy by exploiting global memory and local continuity. In the algorithm, an effective and efficient global memory bank (GMB) is designed to deposit and update object class features, which enables us to exploit support features from other videos to enhance object features in the current video frames. In addition, to further speed up detection, we design an object tracker that performs object detection on non-key frames based on the detection results of the key frame, leveraging the local continuity of the video. Considering the trade-off between detection accuracy and speed, the proposed framework achieves superior performance on the ImageNet VID dataset. Source code will be released to the public via our GitHub website.


Pages: 3681-3693

Number of pages: 13

