CNN-based object detectors are widely used in object detection, object classification, and related tasks. Traditional CNN modules usually adopt complex multi-branch designs, which reduce inference speed and memory utilization. Moreover, many works add attention mechanisms to the detector to extract richer spatial features, but these are typically attached as extra modules alongside convolution and do not fundamentally address the limitations of the convolution operation. Finally, traditional object detectors often use coupled detection heads, which can compromise model performance. To address these problems, we propose a new object detection model, YOLO-SA, based on the popular YOLOv5 detector. We replace the original DarkNet53 backbone of YOLOv5 with the re-parameterized RepVGG module, which greatly reduces model complexity and improves detection accuracy. We introduce a self-attention module in the feature-fusion part of the model, which is independent of the other convolutional layers and outperforms other mainstream attention modules. We also replace the coupled detection head of YOLOv5 with an anchor-based decoupled detection head, which greatly accelerates convergence during training. Experiments show that the proposed YOLO-SA model reaches 71.2% and 75.8% detection accuracy on the COCO2014 and VOC2012 datasets respectively, surpassing the YOLOv5s baseline and other mainstream object detection models.
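To illustrate the structural re-parameterization idea behind RepVGG mentioned above (a sketch in our own notation, not code from this work): at training time a block sums a 3×3 conv, a 1×1 conv, and an identity branch, and at inference the 1×1 kernel and the identity map are folded into the centre of a single 3×3 kernel. The single-channel, no-batch-norm simplification below is ours:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(x, w):
    # Plain single-channel cross-correlation, stride 1, no padding.
    k = w.shape[0]
    H, W = x.shape
    out = np.zeros((H - k + 1, W - k + 1))
    for i in range(H - k + 1):
        for j in range(W - k + 1):
            out[i, j] = np.sum(x[i:i + k, j:j + k] * w)
    return out

# Training-time branches of one RepVGG-style block
# (single channel; batch norm omitted for brevity).
w3 = rng.standard_normal((3, 3))   # 3x3 branch
w1 = rng.standard_normal((1, 1))   # 1x1 branch

x = rng.standard_normal((6, 6))
x_pad = np.pad(x, 1)               # padding=1 keeps spatial size for the 3x3 branch

# Multi-branch forward pass: 3x3 conv + 1x1 conv + identity.
y_branches = conv2d(x_pad, w3) + w1[0, 0] * x + x

# Re-parameterization: fold the 1x1 kernel and the identity map
# into the centre of a single 3x3 kernel.
w_fused = w3.copy()
w_fused[1, 1] += w1[0, 0] + 1.0

y_fused = conv2d(x_pad, w_fused)

print(np.allclose(y_branches, y_fused))  # True: one conv replaces three branches
```

This is why the fused model keeps the training-time accuracy of the multi-branch design while running as a plain single-branch network at inference.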