InternDiffuseDet: Object Detection Method Combining Deformable Convolution and Diffusion Model

Authors
Yuan, Zhixiang [1]
Gao, Yongqi [1]
Affiliations
[1] School of Computer Science and Technology, Anhui University of Technology, Ma’anshan, Anhui 243032, China
Keywords
Convolution
DOI
10.3778/j.issn.1002-8331.2309-0272
Abstract
This paper addresses three problems in object detection in complex scenes: missed detections, limited feature extraction capability, and low detection accuracy. Building on DiffusionDet, it proposes a modified approach that combines deformable convolutions with a diffusion model. The core idea is to increase both the quantity and the quality of the feature maps that reach the detection head. First, InternImage and the DCNv3 deformable convolution operator are introduced into the backbone network, which enlarges the receptive field and strengthens the non-linear modeling capability of the model. Second, an improved feature pyramid network based on selective weighting (CS-FPN) is proposed to enhance the intermediate FPN feature pyramids. In CS-FPN, depthwise separable convolutions separate channel and spatial processing, and the CARAFE operator replaces the traditional upsampling operation to improve resolution and the transfer of semantic information. The SGE attention mechanism is then used to reassemble the feature maps so that hierarchical information is preserved during diffusion. Before the detection head, the DDIM diffusion operation is applied to obtain feature maps at different time steps, which increases the number of feature maps available for detection. Finally, the EIOU algorithm is introduced into both target box matching and the loss function to handle position deviations and scale differences between target boxes. Experimental results on the COCO dataset and a road detection dataset show that, under the same experimental settings, the improved model outperforms the original model by 3.8 and 3.6 percentage points, respectively. These results indicate that the proposed method can improve the accuracy and robustness of object detection and offer a new approach to object detection in real-world scenarios.
© 2024 Journal of Computer Engineering and Applications Beijing Co., Ltd.; Science Press. All rights reserved.
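The abstract names EIOU for box matching and the loss but gives no formula. The sketch below assumes this refers to the commonly used EIoU loss, which adds three penalties to 1 - IoU: the squared center distance, the squared width difference, and the squared height difference, each normalized by the smallest enclosing box. The function name `eiou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` stabilizer are illustrative assumptions, not details taken from the paper.

```python
def eiou_loss(box, gt, eps=1e-9):
    """EIoU loss between two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumed form: 1 - IoU + center term + width term + height term.
    This is an illustrative sketch, not the authors' implementation.
    """
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt

    # Widths and heights of the predicted and ground-truth boxes.
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union.
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)

    # Position penalty: squared center distance over squared enclosing diagonal.
    dx = (x1 + x2) / 2.0 - (gx1 + gx2) / 2.0
    dy = (y1 + y2) / 2.0 - (gy1 + gy2) / 2.0
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Scale penalties: squared width and height differences, each normalized
    # by the matching side of the enclosing box.
    width_term = (w - gw) ** 2 / (cw * cw + eps)
    height_term = (h - gh) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term
```

Under this form, two disjoint boxes still receive a position-dependent penalty even though their IoU is zero, and two concentric boxes of different size are penalized through the width and height terms. This matches the abstract's stated goal of handling both position deviations and scale differences.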
Pages: 203-215