Road crack detection is crucial for inspecting and maintaining civil infrastructure, because cracks pose a risk to road safety. Traditional methods for pavement crack detection are labour-intensive and time-consuming. In recent years, computer vision approaches have shown encouraging results in automating crack localization. However, classical convolutional neural network (CNN)-based approaches lack global attention over spatial features. To improve crack localization on roads, we designed an encoder-decoder architecture that combines a vision transformer (ViT) with CNNs. In addition, a gated-attention module in the decoder is designed to focus on the upsampling process. Furthermore, we proposed a hybrid loss function that combines binary cross-entropy and Dice loss to evaluate the model's effectiveness. Our method achieved a recall, F1-score, and IoU of 98.54%, 98.07%, and 98.72% on the Crack500 dataset and 98.27%, 98.69%, and 98.76% on the Crack dataset. On the proposed dataset, the corresponding figures were 96.89%, 97.20%, and 97.36%.
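As an illustration of the hybrid loss mentioned above, the following is a minimal sketch of one common way to combine binary cross-entropy with Dice loss over a flattened segmentation mask. The abstract does not give the weighting or smoothing used, so the `bce_weight` and `smooth` parameters, the function names, and the equal default weighting are all assumptions made for this example, not the authors' formulation.

```python
import math


def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over per-pixel crack probabilities."""
    total = 0.0
    for p, t in zip(probs, targets):
        # Clamp to avoid log(0); eps is an assumed numerical safeguard.
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 minus the Dice overlap of prediction and mask."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    # smooth is an assumed constant that keeps the ratio defined
    # when both the prediction and the mask are empty.
    return 1.0 - (2.0 * intersection + smooth) / (denom + smooth)


def hybrid_loss(probs, targets, bce_weight=0.5):
    """Weighted sum of BCE and Dice; the 0.5 weighting is an assumption."""
    bce = bce_loss(probs, targets)
    dice = dice_loss(probs, targets)
    return bce_weight * bce + (1.0 - bce_weight) * dice


# Flattened 2x2 ground-truth mask (1 = crack pixel) and two predictions.
mask = [1, 0, 1, 0]
good_prediction = [0.9, 0.1, 0.8, 0.2]
poor_prediction = [0.2, 0.7, 0.3, 0.6]

print(hybrid_loss(good_prediction, mask))
print(hybrid_loss(poor_prediction, mask))
```

The cross-entropy term penalizes each pixel independently, while the Dice term measures overlap over the whole mask, which tends to help when crack pixels are a small fraction of the image.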