Building Precision: Efficient Encoder-Decoder Networks for Remote Sensing Based on Aerial RGB and LiDAR Data

Authors
Sulaiman, Muhammad [1]
Finnesand, Erik [1]
Farmanbar, Mina [1]
Belbachir, Ahmed Nabil [2]
Rong, Chunming [1,2]
Affiliations
[1] Univ Stavanger, Dept Elect Engn & Comp Sci, N-4021 Stavanger, Norway
[2] NORCE Norwegian Res Ctr, N-5008 Bergen, Norway
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Building precision; deep learning; LiDAR; remote sensing; semantic segmentation; U-Net; context-transfer U-Net; convolutional neural network; extraction; images
DOI
10.1109/ACCESS.2024.3391416
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Precise building delineation plays a pivotal role in population data analysis, city management, policy making, and disaster management. Computer vision technologies, particularly deep learning models for semantic segmentation, have proven instrumental in achieving accurate automatic building segmentation in remote sensing applications. However, current state-of-the-art (SOTA) techniques are not optimized for precisely extracting building footprints and, specifically, building boundaries. This deficiency highlights the need to use Light Detection and Ranging (LiDAR) data in conjunction with aerial RGB images and streamlined deep learning models for improved precision. This work uses the MapAI dataset, which includes a variety of objects beyond buildings, such as trees, electricity lines, solar panels, vehicles, and roads. These objects exhibit diverse colors and structures, mirroring the rooftops in Denmark and Norway. To address these problems, this study modifies UNet and CT-UNet to segment buildings from LiDAR data and RGB images, using Intersection over Union (IoU) to evaluate building overlap and Boundary Intersection over Union (BIoU) to evaluate the precision of building boundaries and shapes. The configuration of these networks is changed to accommodate LiDAR data for efficient segmentation. Training batches are augmented to improve model generalization and reduce overfitting; batch normalization also reduces overfitting. Four backbones with transfer learning are employed to enhance convergence and parameter efficiency of segmentation: ResNet50V2, DenseNet201, EfficientNetB4, and EfficientNetV2S. Test-Time Augmentation (TTA) is employed to improve the predicted mask. Experiments are performed using single and ensemble models, with and without augmentation. The ensemble model outperforms the single model, and TTA also improves the results. Combining LiDAR data with RGB images improves the combined score (the average of IoU and BIoU) by 13.33% compared with RGB images alone.
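The abstract names three evaluation quantities: IoU, BIoU, and a combined score defined as their average. The sketch below illustrates how these could be computed for binary masks with NumPy. It is not the paper's or the MapAI benchmark's evaluation code: the function names, the 4-neighbourhood erosion, the `width` parameter, and the handling of empty masks are all assumptions made for illustration.

```python
import numpy as np

def iou(pred, target):
    """Intersection over Union of two binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return np.logical_and(pred, target).sum() / union

def erode(mask, iterations=1):
    """Binary erosion with a 4-neighbourhood; pixels outside the image are background."""
    out = mask.astype(bool)
    for _ in range(iterations):
        p = np.pad(out, 1, constant_values=False)
        # A pixel survives only if it and its four direct neighbours are set.
        out = (p[1:-1, 1:-1] & p[:-2, 1:-1] & p[2:, 1:-1]
               & p[1:-1, :-2] & p[1:-1, 2:])
    return out

def boundary(mask, width=1):
    """Mask pixels lying within `width` erosion steps of the mask's edge."""
    mask = mask.astype(bool)
    return mask & ~erode(mask, width)

def boundary_iou(pred, target, width=1):
    """IoU restricted to the boundary bands of both masks (illustrative BIoU)."""
    return iou(boundary(pred, width), boundary(target, width))

def combined_score(pred, target, width=1):
    """Average of IoU and BIoU, as the abstract defines the combined score."""
    return 0.5 * (iou(pred, target) + boundary_iou(pred, target, width))

# Toy example: a 4x4 "building" and a prediction shifted one pixel to the right.
target = np.zeros((8, 8), dtype=bool)
target[2:6, 2:6] = True
pred = np.zeros((8, 8), dtype=bool)
pred[2:6, 3:7] = True

print(iou(pred, target), boundary_iou(pred, target), combined_score(pred, target))
```

In this toy case the one-pixel shift lowers the boundary-restricted score more than the plain IoU, which is the reason a boundary-focused metric is useful for judging outline precision.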
Pages: 60329-60346
Page count: 18