CTFuseNet: A Multi-Scale CNN-Transformer Feature Fused Network for Crop Type Segmentation on UAV Remote Sensing Imagery

Cited by: 9
Authors
Xiang, Jianjian [1]
Liu, Jia [1]
Chen, Du [1]
Xiong, Qi [1]
Deng, Chongjiu [1]
Affiliations
[1] China Univ Geosci, Sch Comp Sci, Wuhan 430074, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
precision agriculture; UAV remote sensing; semantic segmentation; deep learning; CNN; transformer; feature fusion
DOI
10.3390/rs15041151
Chinese Library Classification number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Timely and accurate crop type information is important for irrigation scheduling, yield estimation, and harvest planning. Unmanned aerial vehicles (UAVs) have emerged as an effective way to obtain high-resolution remote sensing images for crop type mapping. Convolutional neural network (CNN)-based methods, which have excellent local feature extraction capabilities, have been widely used to predict crop types from UAV remote sensing imagery. However, the limited receptive field of a CNN restricts its capture of global contextual information. To address this issue, this study introduces a self-attention-based transformer, which captures long-range feature dependencies in remote sensing imagery as a supplement to local details, and proposes an end-to-end CNN-transformer feature-fused network (CTFuseNet) for accurate crop type segmentation in UAV remote sensing imagery. CTFuseNet first uses parallel CNN and transformer branches in the encoder to extract both local and global semantic features from the imagery. A new feature-fusion module then flexibly aggregates the multi-scale global and local features from the two branches. Finally, the FPNHead of the feature pyramid network serves as the decoder, which adapts better to the multi-scale fused features, and outputs the crop type segmentation results. Comprehensive experiments indicate that CTFuseNet achieves higher crop type segmentation accuracy, with a mean intersection over union (mIoU) of 85.33% and a pixel accuracy (PA) of 92.46% on the benchmark remote sensing dataset, outperforming state-of-the-art networks including U-Net, PSPNet, DeepLabV3+, DANet, OCRNet, SETR, and SegFormer. CTFuseNet is therefore beneficial for crop type segmentation, revealing the advantage of fusing the features extracted by the CNN and the transformer. Further work is needed to improve the accuracy and efficiency of this approach and to assess the model's transferability.
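The two accuracy figures quoted in the abstract, mean intersection over union and pixel accuracy, are standard segmentation metrics derived from a confusion matrix. The following is a minimal Python sketch of how they are commonly computed. It is not the authors' evaluation code; the function names and the flat label-list input are assumptions made for illustration only.

```python
from typing import List, Sequence


def confusion_matrix(truth: Sequence[int], pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels; rows are ground-truth classes, columns are predictions."""
    if len(truth) != len(pred):
        raise ValueError("truth and pred must have the same length")
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        matrix[t][p] += 1
    return matrix


def pixel_accuracy(matrix: List[List[int]]) -> float:
    """Fraction of all pixels whose predicted class equals the true class."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    correct = sum(matrix[c][c] for c in range(len(matrix)))
    return correct / total


def mean_iou(matrix: List[List[int]]) -> float:
    """Average of per-class IoU = TP / (TP + FP + FN).

    Classes that appear in neither truth nor prediction are skipped
    (an assumption; evaluation toolkits differ on this detail).
    """
    ious = []
    for c in range(len(matrix)):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                 # true class c, predicted other
        fp = sum(row[c] for row in matrix) - tp  # predicted c, true class other
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    if not ious:
        raise ValueError("no class has any pixels")
    return sum(ious) / len(ious)
```

In practice the label sequences would be the flattened ground-truth and predicted class maps for the whole test set, so that both metrics are accumulated over all images before the final division.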
Pages: 21
Related papers
50 in total
  • [31] A multi-scale semantic feature fusion method for remote sensing crop classification
    Huang, Xizhi
    Wang, Hong
    Li, Xiaobing
    COMPUTERS AND ELECTRONICS IN AGRICULTURE, 2024, 224
  • [32] Multi-scale and multi-feature high resolution remote sensing image segmentation
    Zhao, Qiang
    Zhang, Sheng
    Huang, Shuling
    International Journal of Applied Mathematics and Statistics, 2013, 51 (22) : 343 - 350
  • [33] Semantic segmentation of multi-scale remote sensing images with contextual feature enhancement
    Zhang, Mei
    Liu, Lingling
    Pei, Yongtao
    Xie, Guojing
    Wen, Jinghua
    VISUAL COMPUTER, 2024, : 1303 - 1317
  • [34] Feature ensemble network for medical image segmentation with multi-scale atrous transformer
    Gai, Di
    Geng, Yuhan
    Huang, Xia
    Huang, Zheng
    Xiong, Xin
    Zhou, Ruihua
    Wang, Qi
    IET IMAGE PROCESSING, 2024, 18 (11) : 3082 - 3092
  • [35] Multi-Scale Feature Interaction Network for Remote Sensing Change Detection
    Zhang, Chong
    Zhang, Yonghong
    Lin, Haifeng
    REMOTE SENSING, 2023, 15 (11)
  • [36] RingMo-Lite: A Remote Sensing Lightweight Network With CNN-Transformer Hybrid Framework
    Wang, Yuelei
    Zhang, Ting
    Zhao, Liangjin
    Hu, Lin
    Wang, Zhechao
    Niu, Ziqing
    Cheng, Peirui
    Chen, Kaiqiang
    Zeng, Xuan
    Wang, Zhirui
    Wang, Hongqi
    Sun, Xian
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 20
  • [37] Crop classification based on G-CNN using multi-scale remote sensing images
    Meng, Mengmeng
    Zhang, Kaixin
    Huang, Yabo
    Li, Ning
    Guo, Zhengwei
    Zhou, Zhimin
    REMOTE SENSING LETTERS, 2024, 15 (09) : 941 - 950
  • [38] EMR-HRNet: A Multi-Scale Feature Fusion Network for Landslide Segmentation from Remote Sensing Images
    Jin, Yuanhang
    Liu, Xiaosheng
    Huang, Xiaobin
    SENSORS, 2024, 24 (11)
  • [39] MCNet: A Multi-scale and Cascade Network for Semantic Segmentation of Remote Sensing Images
    Zhou, Yin
    Li, Tianyi
    Li, Xianju
    Feng, Ruyi
    WEB AND BIG DATA, PT II, APWEB-WAIM 2023, 2024, 14332 : 162 - 176
  • [40] Multi-scale attention fusion network for semantic segmentation of remote sensing images
    Wen, Zhiqiang
    Huang, Hongxu
    Liu, Shuai
    INTERNATIONAL JOURNAL OF REMOTE SENSING, 2023, 44 (24) : 7909 - 7926