Multi-level Proposal Relations Aggregation for Video Object Detection

Authors
Yu, Chongkai [1]
Chen, Wenjie [1]
Wu, Bing [1]
Affiliations
[1] Beijing Inst Technol, Sch Automat, State Key Lab Intelligent Control & Decis Complex, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Video object detection; Relation aggregation; Global-local information
DOI
10.1007/978-3-031-15919-0_61
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Video quality often deteriorates in certain frames, which poses a major challenge for object detection: an object in a degraded frame is difficult to identify using that frame alone. Recent studies have shown that aggregating context information through the self-attention mechanism can enhance the features of key frames. However, these methods exploit only part of the inter-video and intra-video global-local information. Global semantic information and local localization information within the same video can assist object classification and regression. The intra-proposal relations among different videos can provide important cues for distinguishing confusing objects. All of this information can improve the performance of video object detection. In this paper, we design a Multi-Level Proposal Relations Aggregation network to mine inter-video and intra-video global-local proposal relations. For intra-video relations, we aggregate global and local information to augment the proposal features of key frames. For inter-video relations, we aggregate inter-video key-frame features into the target video under the constraint of relation regularization. We flexibly use the relation module to aggregate proposals from different frames. Experiments on the ImageNet VID dataset demonstrate the effectiveness of our method.
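The abstract does not specify the relation module, so the following is only a minimal sketch of the general idea it mentions: enhancing key-frame proposal features by attention-weighted aggregation of proposal features from other frames. The function names (`softmax`, `aggregate_proposals`), the scaled dot-product scoring, and the residual addition are assumptions made for illustration. The learned projections, the global-local split, and the relation regularization described above are not modeled.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate_proposals(key_feats, support_feats):
    """Hypothetical helper: add an attention-weighted sum of support
    proposal features to each key-frame proposal feature.

    key_feats: list of N feature vectors (key-frame proposals).
    support_feats: list of M feature vectors (proposals from other frames).
    """
    d = len(key_feats[0])
    enhanced = []
    for q in key_feats:
        # Scaled dot-product similarity between one key proposal and every support proposal.
        scores = [sum(a * b for a, b in zip(q, s)) / math.sqrt(d) for s in support_feats]
        weights = softmax(scores)
        # Weighted sum of support features (the aggregated context).
        context = [sum(w * s[i] for w, s in zip(weights, support_feats)) for i in range(d)]
        # Residual connection: keep the original feature and add the context.
        enhanced.append([qi + ci for qi, ci in zip(q, context)])
    return enhanced

# Toy example: 2 key-frame proposals and 3 support proposals with 2-D features.
key = [[1.0, 0.0], [0.0, 1.0]]
support = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
enhanced = aggregate_proposals(key, support)
```

In this sketch each key-frame proposal attends to all support proposals at once. A multi-level design like the one described would presumably apply such aggregation separately to local frames, global frames, and frames from other videos.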
Pages: 734-745
Number of pages: 12