CTFuseNet: A Multi-Scale CNN-Transformer Feature Fused Network for Crop Type Segmentation on UAV Remote Sensing Imagery

Cited by: 9
Authors
Xiang, Jianjian [1]
Liu, Jia [1]
Chen, Du [1]
Xiong, Qi [1]
Deng, Chongjiu [1]
Affiliations
[1] China Univ Geosci, Sch Comp Sci, Wuhan 430074, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
precision agriculture; UAV remote sensing; semantic segmentation; deep learning; CNN; transformer; feature fusion
DOI
10.3390/rs15041151
Chinese Library Classification number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Timely and accurate crop type information is important for irrigation scheduling, yield estimation, and harvest planning. Unmanned aerial vehicles (UAVs) have emerged as an effective way to obtain high-resolution remote sensing images for crop type mapping. Convolutional neural network (CNN)-based methods, which have excellent local feature extraction capabilities, have been widely used to predict crop types from UAV remote sensing imagery. However, the limited receptive field of a CNN restricts its capture of global contextual information. To address this issue, this study introduced a self-attention-based transformer, which captures long-range feature dependencies in remote sensing imagery as a supplement to local details, and proposed an end-to-end CNN-transformer feature-fused network (CTFuseNet) for accurate crop type segmentation in UAV remote sensing imagery. CTFuseNet used parallel CNN and transformer branches in the encoder to extract both local and global semantic features from the imagery. A new feature-fusion module was designed to flexibly aggregate the multi-scale global and local features from the two branches. Finally, the FPNHead of the feature pyramid network served as the decoder, chosen for its better adaptation to the multi-scale fused features, and output the crop type segmentation results. Comprehensive experiments indicated that CTFuseNet achieved a mean intersection over union of 85.33% and a pixel accuracy of 92.46% on the benchmark remote sensing dataset, outperforming state-of-the-art networks including U-Net, PSPNet, DeepLabV3+, DANet, OCRNet, SETR, and SegFormer. These results suggest that CTFuseNet is beneficial for crop type segmentation and reveal the advantage of fusing the features extracted by the CNN and the transformer. Further work is needed to improve the accuracy and efficiency of this approach and to assess the model's transferability.
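The two metrics reported in the abstract, mean intersection over union (mIoU) and pixel accuracy, can both be derived from a class confusion matrix. The following is a minimal sketch of the standard definitions, not the authors' evaluation code: the function names and the toy labels are hypothetical, and the abstract does not say how the paper handles ignored labels or whether scores are averaged per image or over the whole dataset.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def pixel_accuracy(cm: List[List[int]]) -> float:
    """Fraction of all pixels whose predicted class matches the ground truth."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total if total else 0.0


def mean_iou(cm: List[List[int]]) -> float:
    """Average of per-class IoU = TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, true class differs
        denom = tp + fp + fn
        if denom:  # assumption: skip classes absent from both labels and predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six flattened pixels, three hypothetical crop classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(f"pixel accuracy = {pixel_accuracy(cm):.4f}, mIoU = {mean_iou(cm):.4f}")
```

Because mIoU weights every class equally while pixel accuracy weights every pixel equally, mIoU is usually the lower of the two when some classes are small or hard to segment, which is consistent with the gap between the 85.33% and 92.46% figures above.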
Pages: 21
Related papers
50 in total
  • [1] Feature fused multi-scale segmentation method for remote sensing imagery
    Chen, T. Q.
    Liu, J. H.
    Wang, Y. H.
    Zhu, F.
    Chen, J.
    Deng, M.
    ADVANCES IN ENERGY, ENVIRONMENT AND MATERIALS SCIENCE, 2016, : 741 - 744
  • [2] Remote sensing image instance segmentation network with transformer and multi-scale feature representation
    Ye, Wenhui
    Zhang, Wei
    Lei, Weimin
    Zhang, Wenchao
    Chen, Xinyi
    Wang, Yanwen
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 234
  • [3] ACTNet: A Dual-Attention Adapter with a CNN-Transformer Network for the Semantic Segmentation of Remote Sensing Imagery
    Zhang, Zheng
    Liu, Fanchen
    Liu, Changan
    Tian, Qing
    Qu, Hongquan
    REMOTE SENSING, 2023, 15 (09)
  • [4] Multi-Scale Orthogonal Model CNN-Transformer for Medical Image Segmentation
    Zhou, Wuyi
    Zeng, Xianhua
    Zhou, Mingkun
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2023, 37 (10)
  • [5] CTFNet: CNN-Transformer Fusion Network for Remote-Sensing Image Semantic Segmentation
    Wu H.
    Huang P.
    Zhang M.
    Tang W.
    IEEE Geoscience and Remote Sensing Letters, 2024, 21 : 1 - 5
  • [6] Multi-scale network for remote sensing segmentation
    Wang, Gaihua
    Zhai, Qianyu
    Lin, Jinheng
    IET IMAGE PROCESSING, 2022, 16 (06) : 1742 - 1751
  • [7] Hybrid CNN and Transformer Network for Semantic Segmentation of UAV Remote Sensing Images
    Zhou X.
    Zhou L.
    Gong S.
    Zhang H.
    Zhong S.
    Xia Y.
    Huang Y.
    IEEE Journal on Miniaturization for Air and Space Systems, 2024, 5 (01) : 33 - 41
  • [8] MFTransNet: A Multi-Modal Fusion with CNN-Transformer Network for Semantic Segmentation of HSR Remote Sensing Images
    He, Shumeng
    Yang, Houqun
    Zhang, Xiaoying
    Li, Xuanyu
    MATHEMATICS, 2023, 11 (03)
  • [9] A CNN-Transformer Combined Remote Sensing Imagery Spatiotemporal Fusion Model
    Jiang, Mingyu
    Shao, Hua
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2024, 17 : 13995 - 14009
  • [10] Multi-Scale CNN-Transformer Dual Network for Hyperspectral Compressive Snapshot Reconstruction
    Huang, Kaixuan
    Sun, Yubao
    Gu, Quan
    APPLIED SCIENCES-BASEL, 2023, 13 (23)