Efficient Depth Fusion Transformer for Aerial Image Semantic Segmentation

Times Cited: 16
Authors
Yan, Li [1 ,2 ]
Huang, Jianming [1 ]
Xie, Hong [1 ]
Wei, Pengcheng [1 ]
Gao, Zhao [2 ]
Affiliations
[1] Wuhan Univ, Sch Geodesy & Geomat, Wuhan 430079, Peoples R China
[2] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China
Keywords
semantic segmentation; self-attention; depth fusion; transformer; RESOLUTION; RGB;
DOI
10.3390/rs14051294
Chinese Library Classification
X [Environmental Science, Safety Science];
Discipline Classification Code
08; 0830;
Abstract
Taking depth into consideration has been proven to improve the performance of semantic segmentation by providing additional geometric information. Most existing works adopt a two-stream network that extracts features from color images and depth images separately with two branches of the same structure, which incurs high memory and computation costs. We find that depth features acquired by simple downsampling can also play a complementary role in the semantic segmentation task, sometimes even better than the two-stream scheme with two identical branches. In this paper, a novel and efficient depth fusion transformer network for aerial image segmentation is proposed. The presented network utilizes patch merging to downsample the depth input, and a depth-aware self-attention (DSA) module is designed to mitigate the gap caused by the differences between the two branches and the two modalities. Concretely, DSA fuses depth features and color features by computing depth similarity and its impact on the self-attention map calculated from color features. Extensive experiments on the ISPRS 2D semantic segmentation datasets validate the efficiency and effectiveness of our method. With nearly half the parameters of the traditional two-stream scheme, our method achieves 83.82% mIoU on the Vaihingen dataset, outperforming other state-of-the-art methods, and 87.43% mIoU on the Potsdam dataset, comparable to the state of the art.
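The abstract does not give the exact DSA formulation, so the following PyTorch sketch is only one plausible reading of it: depth tokens (assumed to already be downsampled to the color-token resolution, e.g. by patch merging) contribute a pairwise depth-similarity term that biases the attention map computed from color features. The class name `DepthAwareSelfAttention`, the additive-bias fusion, and all shapes and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a depth-aware self-attention block, assuming depth
# similarity enters as an additive bias on the color attention logits.
# Illustrative only; not the paper's exact DSA module.
import torch
import torch.nn as nn


class DepthAwareSelfAttention(nn.Module):
    """Self-attention over color tokens, modulated by depth similarity."""

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.scale = (dim // num_heads) ** -0.5
        self.qkv = nn.Linear(dim, dim * 3)      # queries/keys/values from color tokens
        self.depth_proj = nn.Linear(dim, dim)   # projection of depth tokens for similarity
        self.proj = nn.Linear(dim, dim)

    def forward(self, color: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        # color, depth: (B, N, C) token sequences from the two branches
        B, N, C = color.shape
        H = self.num_heads
        q, k, v = self.qkv(color).reshape(B, N, 3, H, C // H).permute(2, 0, 3, 1, 4)

        # Standard attention logits from color features: (B, H, N, N)
        attn = (q @ k.transpose(-2, -1)) * self.scale

        # Pairwise depth similarity as an additive bias on the attention map
        # (one reading of "depth similarity and its impact on self-attention").
        d = self.depth_proj(depth).reshape(B, N, H, C // H).permute(0, 2, 1, 3)
        depth_sim = (d @ d.transpose(-2, -1)) * self.scale

        attn = (attn + depth_sim).softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)


# Toy usage: both token sequences share the same length and channel width,
# as if the depth branch had been downsampled to match the color branch.
dsa = DepthAwareSelfAttention(dim=64)
color_tokens = torch.randn(2, 196, 64)
depth_tokens = torch.randn(2, 196, 64)
out = dsa(color_tokens, depth_tokens)  # -> shape (2, 196, 64)
```

An additive bias keeps the depth branch lightweight, since the depth tokens only need one linear projection per block rather than a full mirrored backbone, which is consistent with the abstract's claim of roughly halving the parameters of a two-stream design.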
Pages: 18