Multi-level refinement enriched feature pyramid network for object detection

Cited by: 0
Authors
Aziz, Lubna [1, 2]
Salam, Md. Sah Bin Haji F. C. [1]
Ayub, Sara [3]
Affiliations
[1] Univ Teknol Malaysia, Sch Comp, Div Artificial Intelligence, Fac Engn, Skudai 81310, Johor, Malaysia
[2] FICT BUITEMS, Dept Comp Engn, Lahore, Pakistan
[3] Univ Teknol Malaysia, Dept Elect Engn, Fac Engn, Skudai 81310, Johor, Malaysia
Keywords
CNN; Object detection; Chained parallel pooling; Computer vision; Feature pyramid;
DOI
10.1016/j.imavis.2021.104287
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Class imbalance and scale imbalance are common in object detection. Class imbalance arises when the number of instances differs substantially across classes, while scale imbalance arises when objects appear at different scales and the number of examples differs across scales. To address scale variance (scale imbalance) and class imbalance together, we propose a simple and effective feature enhancement scheme that explicitly uses all the information in a multi-level structure to generate a multi-level contextual feature pyramid with multiple scales. We also introduce a cascaded refinement scheme that incorporates multi-scale contextual features into the prediction layers of the Single Shot Detector (SSD) to improve their distinctiveness for multi-scale detection. In the feature enhancement scheme, a stack of multi-scale contextual feature modules merges the multi-level and multi-scale features. We then collect features of equivalent scale through the Multi-layer Feature Fusion (MLFF) unit to construct a feature pyramid in which each feature map is made up of layers from multiple levels. Additional robustness and contextual information are integrated into the pyramid through a chained parallel pooling operation. To improve classification and regression, the cascaded refinement scheme captures a large amount of contextual information and refines the anchors to address the class imbalance problem. Experiments are carried out on two benchmark datasets: MS COCO and PASCAL VOC 07/12. The proposed approach achieves state-of-the-art accuracy, with an AP of 40.6 under multi-scale inference on MS COCO test-dev (input size 320 × 320). For 512 × 512 input on MS COCO test-dev, our approach yields an absolute gain in precision of 1.8% over the best reported results for single-stage detectors (AP: 45.7). © 2021 Elsevier B.V. All rights reserved.
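The two kinds of imbalance defined in the abstract can be made concrete with a small counting exercise. The sketch below is illustrative only and is not taken from the paper: the annotations are hypothetical, the helper names `scale_bucket` and `imbalance_ratio` are assumptions, and the small/medium/large area thresholds follow the COCO evaluation convention (32² and 96² pixels).

```python
from collections import Counter

# COCO-style area thresholds (in squared pixels) for small / medium / large objects.
SMALL_MAX_AREA = 32 ** 2
MEDIUM_MAX_AREA = 96 ** 2


def scale_bucket(width, height):
    """Assign a bounding box to a scale bucket by its area (hypothetical helper)."""
    area = width * height
    if area < SMALL_MAX_AREA:
        return "small"
    if area < MEDIUM_MAX_AREA:
        return "medium"
    return "large"


def imbalance_ratio(counts):
    """Most frequent count divided by least frequent count (1.0 means balanced)."""
    return max(counts.values()) / min(counts.values())


# Hypothetical annotations: (class label, box width, box height).
annotations = [
    ("person", 20, 40),
    ("person", 120, 200),
    ("person", 15, 30),
    ("person", 50, 60),
    ("car", 200, 150),
    ("dog", 25, 20),
]

# Class imbalance: unequal number of instances per class.
class_counts = Counter(label for label, _, _ in annotations)
# Scale imbalance: unequal number of instances per object scale.
scale_counts = Counter(scale_bucket(w, h) for _, w, h in annotations)

print("instances per class:", dict(class_counts))
print("instances per scale:", dict(scale_counts))
print("class imbalance ratio:", imbalance_ratio(class_counts))
print("scale imbalance ratio:", imbalance_ratio(scale_counts))
```

A ratio well above 1.0 in either count signals the kind of skew that the feature enhancement and cascaded refinement schemes described above are meant to mitigate.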
Pages: 11
Related papers
  • [1] Multi-level refinement enriched feature pyramid network for object detection
    Aziz, Lubna
    FC, Md. Sah Bin Haji Salam
    Ayub, Sara
    [J]. Image and Vision Computing, 2021, 115
  • [2] Multi-Level Refinement Feature Pyramid Network for Scale Imbalance Object Detection
    Aziz, Lubna
    Salam, Md Sah Bin Haji
    Sheikh, Usman Ullah
    Khan, Surat
    Ayub, Huma
    Ayub, Sara
    [J]. IEEE ACCESS, 2021, 9 : 156492 - 156506
  • [3] Multi-level feature fusion pyramid network for object detection
    Guo, Zebin
    Shuai, Hui
    Liu, Guangcan
    Zhu, Yisheng
    Wang, Wenqing
    [J]. VISUAL COMPUTER, 2023, 39 (09): 4267 - 4277
  • [4] Multi-level feature fusion pyramid network for object detection
    Zebin Guo
    Hui Shuai
    Guangcan Liu
    Yisheng Zhu
    Wenqing Wang
    [J]. The Visual Computer, 2023, 39 : 4267 - 4277
  • [5] Feature refinement with multi-level context for object detection
    Yingdong Ma
    Yanan Wang
    [J]. Machine Vision and Applications, 2023, 34
  • [6] Feature refinement with multi-level context for object detection
    Ma, Yingdong
    Wang, Yanan
    [J]. MACHINE VISION AND APPLICATIONS, 2023, 34 (04)
  • [7] MLA-Net: Feature Pyramid Network with Multi-Level Local Attention for Object Detection
    Yang, Xiaobao
    Wang, Wentao
    Wu, Junsheng
    Ding, Chen
    Ma, Sugang
    Hou, Zhiqiang
    [J]. MATHEMATICS, 2022, 10 (24)
  • [8] Enriched Feature Guided Refinement Network for Object Detection
    Nie, Jing
    Anwer, Rao Muhammad
    Cholakkal, Hisham
    Khan, Fahad Shahbaz
    Pang, Yanwei
    Shao, Ling
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019: 9536 - 9545
  • [9] Multi-level feature enhancement network for object detection in sonar images
    Zhou, Xin
    Zhou, Zihan
    Wang, Manying
    Ning, Bo
    Wang, Yanhao
    Zhu, Pengli
    [J]. JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2024, 100
  • [10] Cross-modal and multi-level feature refinement network for RGB-D salient object detection
    Gao, Yue
    Dai, Meng
    Zhang, Qing
    [J]. VISUAL COMPUTER, 2023, 39 (09): 3979 - 3994