MLSA-YOLO: a multi-level feature fusion and scale-adaptive framework for small object detection

Authors
Jiayu Peng [1]
Kai Lv [2]
Guoliang Wang [2]
Wendong Xiao [2]
Teng Ran [2]
Liang Yuan [2]
Affiliations
[1] Xinjiang University, School of Software
[2] Xinjiang University, School of Mechanical Engineering
Keywords
YOLOv8; Small object detection; Multi-level feature fusion; Scale-adaptive
DOI
10.1007/s11227-025-06961-0
Abstract
Because small objects occupy only a small area of the image, feature extraction paradigms that are poorly suited to them can further erode their already limited information. In addition, inconsistencies between features at different levels of the FPN can lead to suboptimal feature fusion, hindering accurate representation of multi-scale features. As a result, even high-performance detectors struggle to recognize small objects effectively. To address these issues, we propose MLSA-YOLO, a small object detection algorithm based on multi-level feature fusion and scale adaptation. First, we restructure the network architecture using SPD-Conv together with the proposed Convolutional Space-to-Depth (CSPD) module, improving the network’s capacity to capture local spatial details in images and ensuring that information is preserved during downsampling. Second, to address the challenges in feature fusion, we employ a three-layer PAFPN structure in the neck and combine it with the proposed Multi-Level Feature Fusion and Scale-Adaptive (MLSA) feature pyramid network. This design enhances the complementarity of multi-level information while effectively filtering the conflicting information generated during the fusion phase. Third, to improve the quality of feature extraction, we incorporate the designed DCN_C2f module into the neck network. This module accurately captures foreground object features while enhancing the network’s adaptability to geometric deformations of objects. Experimental results show that our approach outperforms other state-of-the-art detection algorithms on the VisDrone2019, DOTA, and FocusTiny datasets. Compared with YOLOv8s, mAP50 improves by 9.5%, 3.4%, and 5.1% on the three datasets, respectively.
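The abstract's point about preserving information during downsampling can be illustrated with the space-to-depth rearrangement that SPD-Conv is generally built on. The following is a minimal sketch under stated assumptions, not the authors' CSPD module: it handles one single-channel map stored as nested lists, and the function name `space_to_depth` and the `scale` parameter are hypothetical names chosen for illustration. In SPD-Conv as usually described, the rearranged sub-maps are stacked along the channel axis and followed by a non-strided convolution, which is omitted here.

```python
def space_to_depth(feature_map, scale=2):
    """Split an H x W map into scale*scale sub-maps of size (H/scale) x (W/scale).

    Every input value appears in exactly one sub-map, so spatial resolution
    drops without discarding any values (unlike strided convolution or pooling).
    """
    height = len(feature_map)
    width = len(feature_map[0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")

    sub_maps = []
    for dy in range(scale):          # row offset inside each scale x scale cell
        for dx in range(scale):      # column offset inside each cell
            sub_maps.append(
                [row[dx::scale] for row in feature_map[dy::scale]]
            )
    return sub_maps


# Illustrative 4 x 4 input holding the values 0..15 (assumed toy data).
grid = [[r * 4 + c for c in range(4)] for r in range(4)]
channels = space_to_depth(grid)

for index, sub_map in enumerate(channels):
    print(index, sub_map)
```

Each of the four 2 x 2 sub-maps acts as one output channel, so a 4 x 4 x 1 input becomes 2 x 2 x 4: the resolution is halved while all sixteen values are kept.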