Safety compliance checking of construction behaviors using visual question answering

Cited by: 15

Authors:

Ding, Yuexiong [1,2]

Liu, Muyang [1,2]

Luo, Xiaowei [1,2]

Affiliations:

[1] City Univ Hong Kong, Dept Architecture & Civil Engn, Hong Kong, Peoples R China

[2] City Univ Hong Kong, Shenzhen Res Inst, Architecture & Civil Engn Res Ctr, Shenzhen, Peoples R China

Source:

AUTOMATION IN CONSTRUCTION | 2022 / Volume 144

Keywords:

Construction safety management; Safety compliance checking; Visual reasoning; Visual question answering; Cross-modal model; Vision-and-language Transformer; Falls

DOI:

10.1016/j.autcon.2022.104580

Chinese Library Classification (CLC) number:

TU [Building Science]

Discipline classification code:

0813

Abstract:

Unsafe construction behavior, one of the leading causes of accidents and casualties, can be reduced by strengthening construction inspection. However, current methods rely on either manual inspection or inefficient cross-modal models built on multiple backbone networks. To alleviate these problems, a "rule-question" transformation and annotation system is formulated, and unsafe behavior detection is recast as a visual reasoning task: visual question answering (VQA). The VQA model is developed based on a vision-and-language Transformer, and unsafe behavior is identified from the output answers. A dataset containing 16 safety rules and 2386 related construction images is used to fine-tune and validate the VQA model. The results show that the developed VQA model achieves an average recall of 0.81 at a faster reasoning speed. Finally, an applet for safety report generation is implemented to demonstrate the feasibility and practicability of VQA-based safety compliance checking.
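The "rule-question" idea in the abstract can be illustrated with a minimal sketch. It assumes each safety rule is rewritten as a closed question whose answer signals compliance or violation. The rule IDs, the question wording, the `RuleQuestion` type, and the `answer_fn` callback are all hypothetical placeholders for illustration; they are not taken from the paper's 16-rule dataset or its model. The `recall` helper shows the standard definition (true positives over all actual positives) and is not the authors' evaluation code.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Set


@dataclass(frozen=True)
class RuleQuestion:
    """One safety rule rewritten as a question (hypothetical structure)."""
    rule_id: str
    question: str
    unsafe_answer: str  # the answer that indicates a violation


# Hypothetical examples only; not the rules used in the paper.
RULE_QUESTIONS = [
    RuleQuestion("R1", "Is the worker wearing a hard hat?", "no"),
    RuleQuestion("R2", "Is the worker on the ladder wearing a harness?", "no"),
]


def check_compliance(
    image: object,
    answer_fn: Callable[[object, str], str],
    rule_questions: Sequence[RuleQuestion] = RULE_QUESTIONS,
) -> List[str]:
    """Ask every rule question about one image; return IDs of violated rules.

    answer_fn stands in for any VQA model: it takes (image, question)
    and returns a short text answer.
    """
    violations = []
    for rq in rule_questions:
        answer = answer_fn(image, rq.question).strip().lower()
        if answer == rq.unsafe_answer:
            violations.append(rq.rule_id)
    return violations


def recall(predicted: Set[str], actual: Set[str]) -> float:
    """Recall = true positives / all actual positives (0.0 if none exist)."""
    if not actual:
        return 0.0
    return len(predicted & actual) / len(actual)
```

In use, `answer_fn` would wrap whatever VQA model is available; a stub that returns fixed answers is enough to exercise the control flow.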


Pages: 11
