Depth-Guided Progressive Network for Object Detection

Cited by: 2
Authors
Ma, Jia-Wei [1, 2]
Liang, Min [2]
Chen, Song-Lu [1, 2]
Chen, Feng [3]
Tian, Shu [2]
Qin, Jingyan [4]
Yin, Xu-Cheng [1, 2]
Affiliations
[1] Univ Sci & Technol Beijing, USTB EEasyTech Joint Lab Artificial Intelligence, Beijing 100083, Peoples R China
[2] Univ Sci & Technol Beijing, Dept Comp Sci & Technol, Beijing 100083, Peoples R China
[3] EEasy Technol Co Ltd, Zhuhai 519000, Peoples R China
[4] Univ Sci & Technol Beijing, Dept Ind Design, Beijing 100083, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Feature extraction; Object detection; Detectors; Interference; Signal to noise ratio; Semantics; Location awareness; multi-scale object; depth-guided; progressive sampling;
DOI
10.1109/TITS.2022.3156365
Chinese Library Classification (CLC) number
TU [Building Science];
Discipline classification code
0813 ;
Abstract
Multi-scale object detection in natural scenes remains challenging. To enhance multi-scale perception, some algorithms combine lower-level and higher-level information through multi-scale feature fusion. However, they ignore the inherent spatial properties among instances and the relations between foreground and background. In addition, the human-defined "center-based" regression quality evaluation strategy, which predicts a score that decreases linearly with the distance to the center of the ground-truth box, is not robust to scale-variant objects. In this work, we propose a Depth-Guided Progressive Network (DGPNet) for multi-scale object detection. Specifically, in addition to predicting classification and localization, the network estimates depth and uses it to weight the image features, yielding a better spatial representation. Depth estimation and 2D object detection are thus learned simultaneously in a unified network, where the depth features are merged into the detection branch as auxiliary information to enhance discrimination among multi-scale objects. Moreover, to overcome the difficulty of empirically fitting the localization quality function, an IoU-aware progressive sampling strategy obtains high-quality predicted boxes on scale-variant objects more adaptively. We divide the sampling process into two stages, "statistical-aware" and "IoU-aware". The former selects thresholds for positive samples based on the statistical characteristics of multi-scale instances, and the latter further selects high-quality samples by IoU from those chosen in the first stage. As a result, the final ranking scores better reflect localization quality. Experiments verify that our method outperforms state-of-the-art methods on the KINS and Cityscapes datasets.
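The two-stage sampling described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which statistics set the first-stage threshold, so the sketch assumes the mean plus the standard deviation of the candidate IoUs. The function names `iou` and `progressive_sample` and the `iou_keep` cutoff are likewise hypothetical.

```python
from statistics import mean, pstdev


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def progressive_sample(gt_box, candidate_boxes, predicted_boxes, iou_keep=0.5):
    """Return (stage1, stage2) index lists for one ground-truth box.

    Stage 1 ("statistical-aware"): keep candidates whose IoU with the
    ground truth reaches mean + std of all candidate IoUs. This
    threshold rule is an assumption made for illustration.
    Stage 2 ("IoU-aware"): among stage-1 samples, keep those whose
    predicted (regressed) box has IoU >= iou_keep with the ground truth.
    The fixed cutoff iou_keep is also an assumption.
    """
    cand_ious = [iou(c, gt_box) for c in candidate_boxes]
    threshold = mean(cand_ious) + pstdev(cand_ious)
    stage1 = [i for i, v in enumerate(cand_ious) if v >= threshold]
    stage2 = [i for i in stage1 if iou(predicted_boxes[i], gt_box) >= iou_keep]
    return stage1, stage2
```

Calling `progressive_sample` with one ground-truth box, a list of candidate boxes, and the boxes regressed from those candidates returns the indices that survive each stage. The second list is always a subset of the first, which reflects the "on the basis of the former" relationship stated in the abstract.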
Pages: 19523-19533
Page count: 11
Related papers
50 in total
  • [31] Depth-guided asymmetric CycleGAN for rain synthesis and image deraining
    Qi, Yinhe
    Zhang, Huanrong
    Jin, Zhi
    Liu, Wanquan
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (25) : 35935 - 35952
  • [32] PROBABILISTIC DEPTH-GUIDED MULTI-VIEW IMAGE DENOISING
    Lee, Chul
    Kim, Chang-Su
    Lee, Sang-Uk
    2013 20TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP 2013), 2013, : 905 - 908
  • [33] Monocular 3-D Object Detection Based on Depth-Guided Local Convolution for Smart Payment in D2D Systems
    Li, Jun
    Song, Wei
    Gao, Yongbin
    Wang, Huixing
    Yan, Yier
    Huang, Bo
    Zhang, Jun
    Wang, Wei
    IEEE INTERNET OF THINGS JOURNAL, 2023, 10 (03) : 2245 - 2254
  • [34] PDNET: PRIOR-MODEL GUIDED DEPTH-ENHANCED NETWORK FOR SALIENT OBJECT DETECTION
    Zhu, Chunbiao
    Cai, Xing
    Huang, Kan
    Li, Thomas H.
    Li, Ge
    2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2019, : 199 - 204
  • [35] DEPTH-GUIDED INPAINTING ALGORITHM FOR FREE-VIEWPOINT VIDEO
    Ma, Lingni
    Do, Luat
    de With, Peter H. N.
    2012 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP 2012), 2012, : 1721 - 1724
  • [36] Depth-guided asymmetric CycleGAN for rain synthesis and image deraining
    Yinhe Qi
    Huanrong Zhang
    Zhi Jin
    Wanquan Liu
    Multimedia Tools and Applications, 2022, 81 : 35935 - 35952
  • [37] DIFu: Depth-Guided Implicit Function for Clothed Human Reconstruction
    Song, Dae-Young
    Lee, HeeKyung
    Seo, Jeongil
    Cho, Donghyeon
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 8738 - 8747
  • [38] Novel View Synthesis via Depth-guided Skip Connections
    Hou, Yuxin
    Solin, Arno
    Kannala, Juho
    2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WACV 2021, 2021, : 3118 - 3127
  • [39] Object Guided External Memory Network for Video Object Detection
    Deng, Hanming
    Hua, Yang
    Song, Tao
    Zhang, Zongpu
    Xue, Zhengui
    Ma, Ruhui
    Robertson, Neil
    Guan, Haibing
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 6677 - 6686
  • [40] Dynamic Weighted Fusion and Progressive Refinement Network for Visible-Depth-Thermal Salient Object Detection
    Luo Y.
    Shao F.
    Mu B.
    Chen H.
    Li Z.
    Jiang Q.
    IEEE Transactions on Circuits and Systems for Video Technology, 2024, 34 (11) : 1 - 1