Building Precision: Efficient Encoder-Decoder Networks for Remote Sensing Based on Aerial RGB and LiDAR Data

Cited by: 0
Authors
Sulaiman, Muhammad [1]
Finnesand, Erik [1]
Farmanbar, Mina [1]
Belbachir, Ahmed Nabil [2]
Rong, Chunming [1, 2]
Affiliations
[1] Univ Stavanger, Dept Elect Engn & Comp Sci, N-4021 Stavanger, Norway
[2] NORCE Norwegian Res Ctr, N-5008 Bergen, Norway
Source
IEEE Access | 2024 | Volume 12
Keywords
Building precision; deep learning; LiDAR; remote sensing; semantic segmentation; U-Net; context-transfer U-Net; convolutional neural network; extraction; images
DOI
10.1109/ACCESS.2024.3391416
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Precise building delineation plays a pivotal role in population data analysis, city management, policy making, and disaster management. Computer vision, and in particular deep learning models for semantic segmentation, has proven instrumental in achieving accurate automatic building segmentation in remote sensing applications. However, current state-of-the-art (SOTA) techniques are not optimized for precisely extracting building footprints and, specifically, building boundaries. This deficiency motivates combining Light Detection and Ranging (LiDAR) data with aerial RGB imagery and streamlined deep learning models to improve precision. This work uses the MapAI dataset, which includes a variety of objects beyond buildings, such as trees, electricity lines, solar panels, vehicles, and roads. These objects show diverse colors and structures, mirroring the rooftops in Denmark and Norway. To address these shortcomings, this study modifies U-Net and CT-UNet to segment buildings from LiDAR data and RGB images, using Intersection over Union (IoU) to evaluate building overlap and Boundary Intersection over Union (BIoU) to evaluate the precision of building boundaries and shapes. The configuration of both networks is changed so that they handle LiDAR data efficiently. Training batches are augmented to improve model generalization and reduce overfitting, and the inclusion of batch normalization reduces overfitting further. Four backbones with transfer learning are employed to improve the convergence and parameter efficiency of segmentation: ResNet50V2, DenseNet201, EfficientNetB4, and EfficientNetV2S. Test-Time Augmentation (TTA) is employed to improve the predicted mask. Experiments are performed using single and ensemble models, with and without augmentation. The ensemble model outperforms the single model, and TTA also improves the results. Adding LiDAR data to RGB improves the combined score (the average of IoU and BIoU) by 13.33% compared with using RGB images alone.
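The two metrics named in the abstract, and their average, can be illustrated with a minimal NumPy sketch. This is not the authors' evaluation code: the function names, the square 3x3 erosion, and the `width` parameter (boundary thickness in pixels) are assumptions made for illustration, and the abstract does not say how the boundary region is extracted. Here the boundary of a mask is taken to be the mask minus its eroded interior, and BIoU is the IoU of the two boundary regions.

```python
import numpy as np


def iou(pred, target):
    """Intersection over Union of two binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0
    return float(np.logical_and(pred, target).sum() / union)


def erode(mask, iterations=1):
    """Binary erosion with a 3x3 square element, zero-padded at the image edge."""
    out = mask.astype(bool)
    h, w = out.shape
    for _ in range(iterations):
        padded = np.pad(out, 1, constant_values=False)
        shrunk = np.ones_like(out)
        # A pixel survives only if all 9 pixels in its neighbourhood are set.
        for dy in range(3):
            for dx in range(3):
                shrunk &= padded[dy:dy + h, dx:dx + w]
        out = shrunk
    return out


def boundary(mask, width=1):
    """Mask pixels lying within `width` erosion steps of the mask contour."""
    mask = mask.astype(bool)
    return mask & ~erode(mask, width)


def boundary_iou(pred, target, width=1):
    """IoU restricted to the boundary regions of both masks (assumed BIoU form)."""
    return iou(boundary(pred, width), boundary(target, width))


def combined_score(pred, target, width=1):
    """Average of IoU and BIoU, as described in the abstract."""
    return 0.5 * (iou(pred, target) + boundary_iou(pred, target, width))


# Hypothetical example: a 4x4 building and a prediction shifted by one pixel.
target = np.zeros((8, 8), dtype=bool)
target[2:6, 2:6] = True
pred = np.zeros((8, 8), dtype=bool)
pred[2:6, 3:7] = True

print(iou(pred, target))           # overlap score
print(boundary_iou(pred, target))  # boundary score
print(combined_score(pred, target))
```

In this toy case the one-pixel shift costs more in BIoU than in IoU, which is why a boundary-sensitive metric is useful when outline precision matters.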
Pages: 60329-60346
Page count: 18