Building Precision: Efficient Encoder-Decoder Networks for Remote Sensing Based on Aerial RGB and LiDAR Data

Cited by: 0

Authors:

Sulaiman, Muhammad [1]

Finnesand, Erik [1]

Farmanbar, Mina [1]

Belbachir, Ahmed Nabil [2]

Rong, Chunming [1,2]

Affiliations:

[1] Univ Stavanger, Dept Elect Engn & Comp Sci, N-4021 Stavanger, Norway

[2] NORCE Norwegian Res Ctr, N-5008 Bergen, Norway

Source:

IEEE ACCESS | 2024 / Vol. 12

Keywords:

Building precision; deep learning; LiDAR; remote sensing; semantic segmentation; U-Net; context-transfer U-Net; CONVOLUTIONAL NEURAL-NETWORK; EXTRACTION; IMAGES;

DOI:

10.1109/ACCESS.2024.3391416

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Precise building delineation plays a pivotal role in population data analysis, city management, policy making, and disaster management. Computer vision, particularly deep learning models for semantic segmentation, has proven instrumental in accurate automatic building segmentation in remote sensing applications. However, current state-of-the-art (SOTA) techniques are not optimized for precisely extracting building footprints and, specifically, building boundaries. This deficiency highlights the need to combine Light Detection and Ranging (LiDAR) data with aerial RGB images and streamlined deep learning for improved precision. This work uses the MapAI dataset, which includes a variety of objects beyond buildings, such as trees, electricity lines, solar panels, vehicles, and roads. These objects showcase diverse colors and structures, mirroring the rooftops in Denmark and Norway. To address these problems, this study modifies UNet and CT-UNet to segment buildings from LiDAR data and RGB images, using Intersection over Union (IoU) to evaluate building overlap and Boundary Intersection over Union (BIoU) to evaluate the precision of building boundaries and shapes. The configuration of these networks is changed to accommodate LiDAR data for efficient segmentation. Training batches are augmented to improve model generalization and reduce overfitting; batch normalization also reduces overfitting. Four backbones with transfer learning are employed to enhance convergence and parameter efficiency of segmentation: ResNet50V2, DenseNet201, EfficientNetB4, and EfficientNetV2S. Test-Time Augmentation (TTA) is employed to improve the predicted mask. Experiments are performed using single and ensemble models, with and without augmentation. The ensemble model outperforms the single model, and TTA also improves the results. LiDAR data with RGB improves the combined score (average of IoU and BIoU) by 13.33% compared to only RGB images.
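The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes binary masks stored as lists of rows of 0/1, and it approximates the boundary region as the mask minus its erosion with a 4-neighbour structuring element over `d` iterations. The function names, the `rect` helper, and the choice of `d` are illustrative assumptions; the paper's own implementation and the MapAI evaluation code may define the boundary band differently.

```python
def iou(pred, truth):
    """Intersection over Union of two binary masks (lists of rows of 0/1)."""
    pairs = [(p, t) for pr, tr in zip(pred, truth) for p, t in zip(pr, tr)]
    inter = sum(1 for p, t in pairs if p and t)
    union = sum(1 for p, t in pairs if p or t)
    # Two empty masks agree perfectly by convention (an assumption).
    return inter / union if union else 1.0


def erode(mask):
    """One erosion step: keep a pixel only if it and its 4 neighbours are set.
    Pixels outside the image are treated as background."""
    h, w = len(mask), len(mask[0])
    offsets = ((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1))

    def on(r, c):
        return 0 <= r < h and 0 <= c < w and mask[r][c] == 1

    return [[1 if all(on(r + dr, c + dc) for dr, dc in offsets) else 0
             for c in range(w)] for r in range(h)]


def boundary(mask, d=1):
    """Band of mask pixels within roughly d pixels of the contour."""
    eroded = mask
    for _ in range(d):
        eroded = erode(eroded)
    return [[m - e for m, e in zip(mr, er)] for mr, er in zip(mask, eroded)]


def boundary_iou(pred, truth, d=1):
    """IoU restricted to the boundary bands of both masks."""
    return iou(boundary(pred, d), boundary(truth, d))


def combined_score(pred, truth, d=1):
    """Average of IoU and BIoU, as the abstract describes the combined score."""
    return (iou(pred, truth) + boundary_iou(pred, truth, d)) / 2


def rect(h, w, r0, r1, c0, c1):
    """Hypothetical helper: h x w mask with a filled rectangle (inclusive)."""
    return [[1 if r0 <= r <= r1 and c0 <= c <= c1 else 0 for c in range(w)]
            for r in range(h)]


truth = rect(6, 7, 1, 4, 1, 4)   # 4x4 "building"
pred = rect(6, 7, 1, 4, 2, 5)    # same shape, shifted one pixel right
print(iou(pred, truth), boundary_iou(pred, truth), combined_score(pred, truth))
```

In this toy case a one-pixel shift gives an IoU of 0.6 but a BIoU of about 0.33, which illustrates why a boundary-focused metric penalizes misaligned outlines more strongly than plain overlap does.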


Pages: 60329-60346

Page count: 18
