InternDiffuseDet: Object Detection Method Combining Deformable Convolution and Diffusion Model

Authors
Yuan, Zhixiang [1]
Gao, Yongqi [1]
Affiliations
[1] School of Computer Science and Technology, Anhui University of Technology, Ma’anshan, Anhui 243032, China
Keywords
Convolution
DOI
10.3778/j.issn.1002-8331.2309-0272
Abstract
This paper addresses three problems in object detection for complex scenes: missed detections, limited feature extraction capability, and low detection accuracy. Building on DiffusionDet, it proposes a modified detector that combines deformable convolutions with a diffusion model. The core idea is to increase both the quantity and the quality of the feature maps that reach the detection head. To that end, InternImage and its DCNv3 deformable convolution operator are introduced into the backbone network, enlarging the receptive field and strengthening the model's non-linear modeling capability. An improved feature pyramid network based on selective weighting (CS-FPN) is proposed to enhance the intermediate FPN features. Channel and spatial processing are separated using depth-wise separable convolutions, and the traditional upsampling operation is replaced by the CARAFE operator to improve resolution and the transfer of semantic information. The SGE attention mechanism is then used to reassemble the feature maps so that hierarchical information is preserved during diffusion. Before the detection head, a DDIM diffusion operation produces feature maps at different time steps, increasing the number of feature maps available for detection. Finally, EIoU is introduced into both target box matching and the loss function to handle position deviations and scale differences between target boxes. Under the same experimental settings, the improved model outperforms the original model by 3.8 percentage points on the COCO dataset and by 3.6 percentage points on a road detection dataset. These results indicate that the proposed method can improve the accuracy and robustness of object detection and offer a new approach to detection problems in real-world scenarios.
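The abstract names EIoU for box matching and loss but does not give its form. As a rough illustration only, the sketch below follows the commonly cited EIoU definition: one minus IoU, plus a normalized center-distance penalty, plus separate width and height penalties normalized by the smallest enclosing box. The function name `eiou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` stabilizer are assumptions made for this example, not details taken from the paper.

```python
def eiou_loss(box, gt, eps=1e-9):
    """Illustrative EIoU loss for two axis-aligned boxes (x1, y1, x2, y2).

    Hypothetical helper: follows the commonly cited EIoU formulation,
    not necessarily the exact variant used in the paper.
    """
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt

    # Widths and heights of the predicted and ground-truth boxes.
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union.
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)

    # Squared distance between box centers, normalized by the
    # squared diagonal of the enclosing box (position deviation).
    dx = (x1 + x2) / 2.0 - (gx1 + gx2) / 2.0
    dy = (y1 + y2) / 2.0 - (gy1 + gy2) / 2.0
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Separate width and height penalties (scale difference).
    width_term = (w - gw) ** 2 / (cw * cw + eps)
    height_term = (h - gh) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term
```

In this form the center term penalizes position deviation and the width and height terms penalize scale difference, which matches the two error sources the abstract mentions. Identical boxes give a loss near zero, while disjoint boxes give a loss above one.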
© 2024 Journal of Computer Engineering and Applications Beijing Co., Ltd.; Science Press. All rights reserved.
Pages: 203-215