Efficient Depth Fusion Transformer for Aerial Image Semantic Segmentation

Cited by: 16
Authors
Yan, Li [1, 2]
Huang, Jianming [1 ]
Xie, Hong [1 ]
Wei, Pengcheng [1 ]
Gao, Zhao [2 ]
Affiliations
[1] Wuhan Univ, Sch Geodesy & Geomat, Wuhan 430079, Peoples R China
[2] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China
Keywords
semantic segmentation; self-attention; depth fusion; transformer; RESOLUTION; RGB
DOI
10.3390/rs14051294
Chinese Library Classification number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Taking depth into account has been shown to improve semantic segmentation performance by providing additional geometric information. Most existing works adopt a two-stream network that extracts features from color images and depth images separately using two branches of the same structure, which incurs high memory and computation costs. We find that depth features acquired by simple downsampling can also play a complementary role in the semantic segmentation task, sometimes performing even better than the two-stream scheme with two identical branches. In this paper, we propose a novel and efficient depth fusion transformer network for aerial image segmentation. The network uses patch merging to downsample the depth input, and a depth-aware self-attention (DSA) module is designed to mitigate the gap caused by the differences between the two branches and the two modalities. Concretely, the DSA module fuses depth features and color features by computing depth similarity and its impact on the self-attention map calculated from the color features. Extensive experiments on the ISPRS 2D semantic segmentation dataset validate the efficiency and effectiveness of our method. With nearly half the parameters of the traditional two-stream scheme, our method achieves 83.82% mIoU on the Vaihingen dataset, outperforming other state-of-the-art methods, and 87.43% mIoU on the Potsdam dataset, comparable to the state of the art.
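The two mechanisms named in the abstract, patch merging of the depth input and a depth-modulated self-attention map, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the DSA formula, so the depth similarity used here (negative squared distance between depth tokens), the additive fusion into the attention logits, and the names `patch_merge`, `depth_aware_attention`, and `alpha` are all assumptions made for illustration. The patch merging also omits the learned linear projection that transformer backbones usually apply afterwards.

```python
import numpy as np


def patch_merge(x, factor=2):
    """Downsample an (H, W, C) map by folding each factor x factor patch
    into the channel axis, giving (H/factor, W/factor, factor*factor*C).
    Assumption: no learned projection is applied afterwards."""
    h, w, c = x.shape
    assert h % factor == 0 and w % factor == 0, "H and W must divide evenly"
    x = x.reshape(h // factor, factor, w // factor, factor, c)
    x = x.transpose(0, 2, 1, 3, 4)  # (H/f, W/f, f, f, C)
    return x.reshape(h // factor, w // factor, factor * factor * c)


def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def depth_aware_attention(color_tokens, depth_tokens, w_q, w_k, w_v, alpha=1.0):
    """Single-head self-attention over color tokens whose attention map is
    biased by a depth similarity term (hypothetical fusion rule).

    color_tokens: (N, C) color features
    depth_tokens: (N, D) depth features aligned with the color tokens
    """
    q = color_tokens @ w_q
    k = color_tokens @ w_k
    v = color_tokens @ w_v
    # Standard scaled dot-product logits computed from color features only.
    color_logits = q @ k.T / np.sqrt(q.shape[-1])
    # Assumed depth similarity: tokens at similar depth get a larger bias.
    diff = depth_tokens[:, None, :] - depth_tokens[None, :, :]
    depth_sim = -np.sum(diff ** 2, axis=-1)
    attn = softmax(color_logits + alpha * depth_sim, axis=-1)
    return attn @ v, attn


# Toy usage: an 8x8 single-channel depth map merged to 4x4 = 16 tokens.
rng = np.random.default_rng(0)
depth_map = rng.random((8, 8, 1))
depth_tokens = patch_merge(depth_map, factor=2).reshape(16, -1)  # (16, 4)
color_tokens = rng.standard_normal((16, 32))
w_q, w_k, w_v = (rng.standard_normal((32, 32)) * 0.1 for _ in range(3))
fused, attn = depth_aware_attention(color_tokens, depth_tokens, w_q, w_k, w_v)
print(fused.shape, attn.shape)
```

Because the depth branch here is only a reshape, it adds no parameters, which is the kind of saving over a second full feature extractor that the abstract points to. Setting `alpha=0` recovers plain color-only self-attention.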
Pages: 18
Related papers
50 in total
  • [41] Image Semantic Segmentation Based on Depth Parallel Convolutional Networks
    Qin, Zi-yang
    [J]. 2018 INTERNATIONAL CONFERENCE ON COMPUTER, COMMUNICATIONS AND MECHATRONICS ENGINEERING (CCME 2018), 2018, 332 : 239 - 243
  • [42] Scale-equivariant convolution for semantic segmentation of depth image
    Marumo, Hidetaka
    Matsubara, Takashi
    [J]. IEICE NONLINEAR THEORY AND ITS APPLICATIONS, 2024, 15 (01): : 36 - 53
  • [43] Multi-Feature Fusion Aerial Image Segmentation in Complex Background
    Yang, Rui
    Qian, Xiao Jun
    Zhang, Bing Bing
    [J]. ICVISP 2019: PROCEEDINGS OF THE 3RD INTERNATIONAL CONFERENCE ON VISION, IMAGE AND SIGNAL PROCESSING, 2019,
  • [44] Aerial image semantic segmentation using DCNN predicted distance maps
    Chai, Dengfeng
    Newsam, Shawn
    Huang, Jingfeng
    [J]. ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2020, 161 : 309 - 322
  • [45] Entropy Guided Adversarial Domain Adaptation for Aerial Image Semantic Segmentation
    Zheng, Aihua
    Wang, Ming
    Li, Chenglong
    Tang, Jin
    Luo, Bin
    [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [46] SAR AND OBLIQUE AERIAL OPTICAL IMAGE FUSION FOR URBAN AREA IMAGE SEGMENTATION
    Fagir, Julian
    Schubert, Adrian
    Frioud, Max
    Henke, Daniel
    [J]. ISPRS HANNOVER WORKSHOP: HRIGI 17 - CMRT 17 - ISA 17 - EUROCOW 17, 2017, 42-1 (W1): : 639 - 642
  • [47] THE EFFECT OF FOCAL LOSS IN SEMANTIC SEGMENTATION OF HIGH RESOLUTION AERIAL IMAGE
    Doi, Kento
    Iwasaki, Akira
    [J]. IGARSS 2018 - 2018 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2018, : 6919 - 6922
  • [48] Combining Swin Transformer With UNet for Remote Sensing Image Semantic Segmentation
    Fan, Lili
    Zhou, Yu
    Liu, Hongmei
    Li, Yunjie
    Cao, Dongpu
    [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61 : 1 - 11
  • [49] Swin Transformer Embedding UNet for Remote Sensing Image Semantic Segmentation
    He, Xin
    Zhou, Yong
    Zhao, Jiaqi
    Zhang, Di
    Yao, Rui
    Xue, Yong
    [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [50] Efficient image segmentation preserving semantic object shapes
    Park, HS
    Beom, J
    [J]. IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 1999, E82A (06) : 879 - 886