Lightweight DETR-YOLO method for detecting shipwreck target in side-scan sonar

Cited by: 0
Authors
Tang Y. [1 ]
Li H. [1 ]
Zhang W. [2 ]
Bian S. [1 ]
Zhai G. [3 ]
Liu M. [4 ]
Zhang X. [5 ]
Affiliations
[1] College of Electrical Engineering, Naval University of Engineering, Wuhan
[2] System Demonstration Center of Battle Environment Security Bureau of Central Military Commission, Beijing
[3] Naval Institute of Oceanographic Surveying and Mapping, Tianjin
[4] Unit 91001 of the PLA, Beijing
[5] Information Network Center, China University of Geosciences (Beijing), Beijing
Keywords
DETR-YOLO model; multi-scale feature complex fusion; weighted boxes fusion (WBF);
DOI
10.12305/j.issn.1001-506X.2022.08.06
Abstract
Although the side-scan sonar shipwreck target detection method based on the YOLOv5 algorithm achieves good detection accuracy and speed, further improving small-target detection accuracy against complex ocean-noise backgrounds, reducing the missed-detection and false-alarm rates for overlapping targets, and making the model lightweight remain urgent problems. To this end, this paper integrates the structures of DETR (end-to-end object detection with transformers) and YOLOv5, and proposes a lightweight side-scan sonar shipwreck detection model, DETR-YOLO. Firstly, a multi-scale feature complex fusion module is added to improve the detection of small targets. Then, the SENet (squeeze-and-excitation networks) attention mechanism is integrated to strengthen sensitivity to important channel features. Finally, the WBF (weighted boxes fusion) box-fusion strategy is adopted to improve the localization accuracy and confidence of the detection boxes. Experimental results show that the model's AP_0.5 and AP_0.5:0.95 on the test set reach 84.5% and 57.7%, respectively, a substantial improvement over the Transformer and YOLOv5 models. At the cost of a small loss in detection efficiency and a modest increase in model weight, higher detection accuracy is achieved, and the model improves full-scene understanding and small-scale overlapping-target handling while meeting the needs of lightweight engineering deployment. © 2022 Chinese Institute of Electronics. All rights reserved.
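The WBF strategy mentioned in the abstract can be sketched in plain Python. This is a hypothetical minimal implementation, not the paper's code: the function names and the IoU threshold of 0.55 are illustrative assumptions. Unlike NMS, which discards lower-scoring overlapping boxes, WBF replaces each cluster of overlapping boxes with a confidence-weighted average box:

```python
# Minimal sketch of weighted boxes fusion (WBF); illustrative only,
# not the paper's implementation. Boxes are (x1, y1, x2, y2) tuples.

def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def weighted_boxes_fusion(boxes, scores, iou_thr=0.55):
    """Cluster boxes by IoU; each cluster becomes one fused box whose
    coordinates are the confidence-weighted average of its members."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    clusters = []
    for i in order:
        b, s = boxes[i], scores[i]
        for c in clusters:
            if iou(c["box"], b) > iou_thr:
                c["members"].append((b, s))
                w = sum(sc for _, sc in c["members"])
                # Weighted average of each coordinate over cluster members.
                c["box"] = tuple(
                    sum(sc * bb[k] for bb, sc in c["members"]) / w
                    for k in range(4)
                )
                # Average confidence over the cluster.
                c["score"] = w / len(c["members"])
                break
        else:
            clusters.append({"box": b, "score": s, "members": [(b, s)]})
    return [(c["box"], c["score"]) for c in clusters]
```

For two overlapping boxes (0, 0, 10, 10) with score 0.9 and (1, 1, 11, 11) with score 0.6, the fused box's x1 is (0.9·0 + 0.6·1)/1.5 = 0.4, i.e. the higher-confidence detection pulls the fused box toward itself, which is the localization benefit the abstract refers to.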
Pages: 2427-2436
Page count: 9
Related papers: 32 in total