Lightweight object detection based on split attention and linear transformation

Cited by: 0

Authors:

Zhang Y. [1,2]

Sun J.-X. [1]

Sun Y.-M. [1,2]

Liu S.-D. [1,2]

Wang C.-Q. [3]

Affiliations:

[1] School of Computer and Information Engineering, Tianjin Chengjian University, Tianjin

[2] Tianjin Intelligent Elderly Care and Health Service Engineering Research Center, Tianjin

[3] Tianjin Keyvia Electric Limited Company, Tianjin

Source:

Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science) | 2023, Vol. 57, No. 6

Keywords:

lightweight; linear transformation; object detection; pyramid split attention; YOLO

DOI:

10.3785/j.issn.1008-973X.2023.06.015


Abstract:

To meet the real-time and lightweight-model requirements of object detection while improving detection accuracy, a lightweight object detection algorithm, PG-YOLOv5, based on pyramid split attention and linear transformation was proposed. PG-YOLOv5 optimizes the feature fusion module of YOLOv5. First, the pyramid split attention module was used to capture the spatial information of feature maps at different scales and enrich the feature space, which improved the network's multi-scale feature representation and its detection accuracy. Then, the GhostBottleNeck module, based on linear transformation, was used to combine a small number of original feature maps with those generated by linear transformation, which effectively reduced the number of model parameters. The mean average precision increased from 81.2% for YOLOv5L to 85.7% for PG-YOLOv5, and PG-YOLOv5 had 36% fewer parameters than YOLOv5L. PG-YOLOv5 was deployed on a Jetson TX2, and object detection software was designed. Experimental results showed that the Jetson TX2-based object detection system processed images at 262.1 ms/frame, with a mean average precision of 85.2% for PG-YOLOv5. Compared with the original YOLOv5L model, PG-YOLOv5 is more suitable for edge deployment. © 2023 Zhejiang University. All rights reserved.
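
The parameter saving that the abstract attributes to linear transformation can be illustrated with a rough per-layer count. The sketch below is not taken from the paper: it assumes a Ghost-style layer in which a primary convolution produces half of the output maps and a cheap 3x3 depthwise convolution generates the rest. The function names, the channel sizes, and the `ratio` and `cheap_k` values are hypothetical choices for illustration only.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_params(c_in, c_out, k, ratio=2, cheap_k=3):
    """Weights in a Ghost-style layer (bias ignored).

    A primary convolution produces c_out // ratio intrinsic maps. Each
    intrinsic map then yields (ratio - 1) extra maps through a cheap
    depthwise cheap_k x cheap_k convolution (the linear transformation).
    """
    if c_out % ratio != 0:
        raise ValueError("c_out must be divisible by ratio in this sketch")
    intrinsic = c_out // ratio
    primary = conv_params(c_in, intrinsic, k)
    cheap = intrinsic * (ratio - 1) * cheap_k * cheap_k
    return primary + cheap


# Hypothetical layer: 256 input channels, 256 output channels, 3x3 kernel.
standard = conv_params(256, 256, 3)
ghost = ghost_params(256, 256, 3)
saving = 1 - ghost / standard
print(f"standard={standard}, ghost={ghost}, saving={saving:.1%}")
```

Under these assumptions a single layer needs roughly half the weights of a standard convolution. The 36% figure in the abstract is a whole-model number, so it would depend on which layers the authors actually replaced.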


Pages: 1195-1204

Page count: 9

