A CNN-Transformer Hybrid Model Based on CSWin Transformer for UAV Image Object Detection

Cited by: 24
Authors
Lu, Wanjie [1]
Lan, Chaozhen [2]
Niu, Chaoyang [1]
Liu, Wei [1]
Lyu, Liang [2]
Shi, Qunshan [2]
Wang, Shiju [1]
Affiliations
[1] PLA Strategic Support Force Information Engineering University, Institute of Data and Target Engineering, Zhengzhou 450001, People's Republic of China
[2] PLA Strategic Support Force Information Engineering University, Institute of Geospatial Information, Zhengzhou 450001, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Object detection; Transformers; Feature extraction; Detectors; Autonomous aerial vehicles; Computational modeling; Training; Convolutional neural network (CNN); hybrid network; object detection; transformer; unmanned aerial vehicle (UAV) image; NETWORK;
DOI
10.1109/JSTARS.2023.3234161
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract
Object detection in unmanned aerial vehicle (UAV) images has widespread applications in numerous fields; however, the complex backgrounds, diverse scales, and uneven distribution of objects in UAV images make it a challenging task. This study proposes a convolutional neural network (CNN)-transformer hybrid model for efficient object detection in UAV images. The model has three advantages that contribute to improved detection performance. First, the efficient and effective cross-shaped window (CSWin) transformer is used as a backbone to obtain image features at different levels, and these features are fed into a feature pyramid network to achieve a multiscale representation, which benefits multiscale object detection. Second, a hybrid patch embedding module is constructed to extract and utilize low-level information such as the edges and corners of the image. Finally, a slicing-based inference method is constructed to fuse the inference results of the original image and the sliced images, which improves small-object detection accuracy without modifying the original network. Experimental results on public datasets show that the proposed method improves performance more effectively than several popular and state-of-the-art object detection methods.
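The slicing-based inference idea in the abstract (run the detector on the original image and on image slices, then fuse the results) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how the results are fused, so greedy non-maximum suppression (NMS) is used here as an assumed fusion step. The names `slice_windows`, `sliced_inference`, and the `detect` callable are hypothetical, and images are plain nested lists so the example has no dependencies.

```python
from typing import Callable, List, Sequence, Tuple

# A detection is (x1, y1, x2, y2, score) in pixel coordinates.
Box = Tuple[float, float, float, float, float]


def slice_windows(width: int, height: int, tile: int, overlap: float):
    """Return (x0, y0, x1, y1) tiles that cover the whole image."""
    step = max(1, int(tile * (1 - overlap)))
    xs = list(range(0, max(width - tile, 0) + 1, step))
    ys = list(range(0, max(height - tile, 0) + 1, step))
    # Add a final tile flush with the right/bottom edge if needed.
    if xs[-1] + tile < width:
        xs.append(width - tile)
    if ys[-1] + tile < height:
        ys.append(height - tile)
    return [(x, y, min(x + tile, width), min(y + tile, height))
            for y in ys for x in xs]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], iou_thr: float = 0.5) -> List[Box]:
    """Greedy NMS: keep the highest-scoring box among overlapping ones."""
    kept: List[Box] = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box, k) < iou_thr for k in kept):
            kept.append(box)
    return kept


def sliced_inference(image: Sequence[Sequence[float]],
                     detect: Callable[[Sequence[Sequence[float]]], List[Box]],
                     tile: int = 512, overlap: float = 0.2,
                     iou_thr: float = 0.5) -> List[Box]:
    """Fuse full-image detections with detections from overlapping slices."""
    height, width = len(image), len(image[0])
    results = list(detect(image))  # pass over the original image
    for x0, y0, x1, y1 in slice_windows(width, height, tile, overlap):
        crop = [row[x0:x1] for row in image[y0:y1]]
        for bx1, by1, bx2, by2, score in detect(crop):
            # Shift slice-local boxes back into full-image coordinates.
            results.append((bx1 + x0, by1 + y0, bx2 + x0, by2 + y0, score))
    return nms(results, iou_thr)  # assumed fusion step
```

Because the slices are passed to the same `detect` callable as the full image, the detector itself is left unchanged, which matches the abstract's point that small-object accuracy is improved without modifying the original network.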
Pages: 1211-1231
Number of pages: 21
Related papers
50 in total
  • [21] Power transmission line anomaly detection scheme based on CNN-transformer model
    Gao, Ming
    Zhang, Wenfei
    INTERNATIONAL JOURNAL OF GRID AND UTILITY COMPUTING, 2021, 12 (04) : 388 - 395
  • [22] Transformer-CNN for small image object detection
    Chen, Yan-Lin
    Lin, Chun-Liang
    Lin, Yu-Chen
    Chen, Tzu-Chun
    SIGNAL PROCESSING-IMAGE COMMUNICATION, 2024, 129
  • [23] Weak Appearance Aware Pipeline Leak Detection based on CNN-Transformer Hybrid Architecture
    Zhang, Bulin
    Yuan, Haiwen
    Ge, Jie
    Cheng, Li
    Li, Xuan
    Xiao, Changshi
    IEEE Transactions on Instrumentation and Measurement, 2024,
  • [24] Hybrid CNN-Transformer Network for Electricity Theft Detection in Smart Grids
    Bai, Yu
    Sun, Haitong
    Zhang, Lili
    Wu, Haoqi
    SENSORS, 2023, 23 (20)
  • [25] SaltFormer: A hybrid CNN-Transformer network for automatic salt dome detection
    Li, Yang
    Peng, Suping
    He, Dengke
    Computers and Geosciences, 2025, 195
  • [26] A novel hybrid CNN-Transformer model for EEG Motor Imagery classification
    Ma, Yaxin
    Song, Yonghao
    Gao, Fei
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [27] DACTransNet: A Hybrid CNN-Transformer Network for Histopathological Image Classification of Pancreatic Cancer
    Kou, Yongqing
    Xia, Cong
    Jiao, Yiping
    Zhang, Daoqiang
    Ge, Rongjun
    ARTIFICIAL INTELLIGENCE, CICAI 2023, PT II, 2024, 14474 : 422 - 434
  • [28] Hybrid CNN-transformer network for efficient CSI feedback
    Zhao, Ruohan
    Liu, Ziang
    Song, Tianyu
    Jin, Jiyu
    Jin, Guiyue
    Fan, Lei
    PHYSICAL COMMUNICATION, 2024, 66
  • [29] HTC-Net: A hybrid CNN-transformer framework for medical image segmentation
    Tang, Hui
    Chen, Yuanbin
    Wang, Tao
    Zhou, Yuanbo
    Zhao, Longxuan
    Gao, Qinquan
    Du, Min
    Tan, Tao
    Zhang, Xinlin
    Tong, Tong
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2024, 88
  • [30] A CNN-TRANSFORMER HYBRID FEATURE DESCRIPTOR FOR OPTICAL-SAR IMAGE REGISTRATION
    Lin, Mingxin
    Liu, Binyuan
    Liu, Yijun
    Wang, Qingsong
    IGARSS 2023 - 2023 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2023, : 6069 - 6072