Global Memory and Local Continuity for Video Object Detection

Cited by: 13

Authors:

Han, Liang [1]

Yin, Zhaozheng [1, 2]

Affiliations:

[1] SUNY Stony Brook, Dept Comp Sci, Stony Brook, NY 11794 USA

[2] SUNY Stony Brook, Dept Biomed Informat, Stony Brook, NY 11794 USA

Source:

IEEE TRANSACTIONS ON MULTIMEDIA | 2023 / Vol. 25

Funding:

U.S. National Science Foundation;

Keywords:

Feature extraction; Object detection; Detectors; Proposals; Target tracking; Signal processing algorithms; Costs; Video object detection; global memory bank; feature aggregation; local continuity; object tracker;

DOI:

10.1109/TMM.2022.3164253

Chinese Library Classification:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

To deal with challenges in video object detection (VOD) such as occlusion and motion blur, many state-of-the-art video object detectors adopt a feature aggregation module that encodes long-range contextual information to support the current frame. These detectors have three main drawbacks: first, frame-wise detection slows down detection; second, frame-wise detection usually ignores the local continuity of objects in a video, resulting in temporally inconsistent detections; third, the feature aggregation module usually encodes temporal features from either a local video clip or a single video, without exploiting features in other videos. In this work, we develop an online VOD algorithm that aims to balance high speed and high accuracy by exploiting global memory and local continuity. In the algorithm, an effective and efficient global memory bank (GMB) is designed to deposit and update object class features, which enables us to exploit support features from other videos to enhance object features in the current video frames. In addition, to further speed up detection, we design an object tracker that performs object detection on non-key frames based on the detection results of the key frame, leveraging the local continuity property of the video. Considering the trade-off between detection accuracy and speed, the proposed framework achieves superior performance on the ImageNet VID dataset. Source code will be released to the public via our GitHub website.
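The two ideas in the abstract, a class-level memory shared across videos and a tracker that handles non-key frames, can be illustrated with a minimal sketch. This record does not describe the paper's actual GMB update rule, aggregation scheme, or tracker, so everything below is an assumption made for illustration: the `GlobalMemoryBank` class, its moving-average update, the weighted blend in `enhance`, the fixed `key_interval` schedule, and the `detect` and `track` callables are all hypothetical stand-ins.

```python
class GlobalMemoryBank:
    """Hypothetical per-class feature store (not the paper's GMB design)."""

    def __init__(self, momentum=0.9):
        self.momentum = momentum
        self.features = {}  # class label -> feature vector (list of floats)

    def update(self, label, feature):
        """Deposit a feature, or blend it into the stored one by moving average."""
        old = self.features.get(label)
        if old is None:
            self.features[label] = list(feature)
        else:
            m = self.momentum
            self.features[label] = [m * o + (1 - m) * f
                                    for o, f in zip(old, feature)]

    def enhance(self, label, feature, weight=0.5):
        """Blend a current-frame feature with the stored class feature, if any."""
        support = self.features.get(label)
        if support is None:
            return list(feature)
        return [(1 - weight) * f + weight * s
                for f, s in zip(feature, support)]


def run_video(frames, detect, track, key_interval=5):
    """Run the detector on key frames and the tracker on the frames in between.

    `detect(frame)` and `track(frame, previous_result)` are caller-supplied
    placeholders; a fixed interval is an assumed key-frame schedule.
    """
    results = []
    last = None
    for index, frame in enumerate(frames):
        if last is None or index % key_interval == 0:
            last = detect(frame)       # full detection on a key frame
        else:
            last = track(frame, last)  # propagate the previous result
        results.append(last)
    return results
```

Because the memory is keyed by class label rather than by video, features deposited while processing one video remain available when a later video contains the same class, which is the cross-video support the abstract refers to.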


Pages: 3681-3693

Page count: 13
