Relation-Guided Multi-stage Feature Aggregation Network for Video Object Detection

Authors
Yao, Tingting [1]
Cao, Fuxiao [1]
Mi, Fuheng [1]
Li, Danmeng [1]
Affiliation
[1] Dalian Maritime Univ, Coll Informat Sci & Technol, Dalian 116026, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Video object detection; Temporal context information; Feature aggregation; Temporal relation-guided;
DOI
10.1007/978-981-99-8537-1_12
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Video object detection has received extensive research attention, and various methods have been proposed. The quality of a single frame in the original video is often degraded by motion blur and object occlusion, which leads to detection failures. Although some methods attempt to enhance the feature representation of each frame by aggregating temporal context information from other frames, existing methods are usually sensitive to changes in object appearance and scale, which leads to false or missed detections. In this paper, we therefore propose a Relation-guided Multi-stage Feature Aggregation (RMFA) network for video object detection. First, a Multi-Stage Feature Aggregation (MSFA) framework is devised to aggregate the feature representations of global and local support frames in each stage, so that both global semantic information and local motion information are better captured. Furthermore, a Multi-sources Feature Aggregation (MFA) module is proposed to enhance the quality of the support frames and thereby improve the feature representation of the current frame. Finally, a Temporal Relation-Guided (TRG) module is proposed to improve feature aggregation by supervising the semantic similarity relationships between different object proposals, which enhances the model's ability to selectively store valuable features. Qualitative and quantitative experimental results on the ImageNet VID dataset demonstrate that our model achieves superior video object detection results compared with a number of state-of-the-art methods. In particular, our model performs especially well when the object is occluded or under fast motion.
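The general idea of aggregating temporal context from support frames can be illustrated with a minimal sketch. This is not the RMFA network, nor its MSFA, MFA, or TRG modules, whose details the abstract does not give. It assumes a generic similarity-weighted scheme: each support-frame feature is weighted by a softmax over its cosine similarity to the current-frame feature, and the weighted sum is added to the current feature. The names `aggregate`, `cosine`, `softmax`, and `temperature` are hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def softmax(scores):
    """Numerically stable softmax over a non-empty list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(current, supports, temperature=1.0):
    """Enhance `current` with a similarity-weighted sum of `supports`.

    Assumption: a generic attention-style aggregation, not the method
    of the paper. Returns the enhanced feature and the weights used.
    """
    if not supports:
        return list(current), []
    sims = [cosine(current, s) / temperature for s in supports]
    weights = softmax(sims)
    context = [
        sum(w * s[d] for w, s in zip(weights, supports))
        for d in range(len(current))
    ]
    enhanced = [c + x for c, x in zip(current, context)]
    return enhanced, weights


# Toy example: one support frame matches the current frame, one does not.
current_feature = [1.0, 0.0]
support_features = [[1.0, 0.0], [0.0, 1.0]]
enhanced_feature, weights = aggregate(current_feature, support_features)
```

In this toy case the support feature that matches the current frame receives the larger weight, which is the behaviour a similarity-based aggregation is meant to have; lowering `temperature` makes the weighting more selective.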
Pages: 146-157
Number of pages: 12