Safety compliance checking of construction behaviors using visual question answering

Cited by: 15
Authors
Ding, Yuexiong [1,2]
Liu, Muyang [1,2]
Luo, Xiaowei [1,2]
Affiliations
[1] City Univ Hong Kong, Dept Architecture & Civil Engn, Hong Kong, Peoples R China
[2] City Univ Hong Kong, Shenzhen Res Inst, Architecture & Civil Engn Res Ctr, Shenzhen, Peoples R China
Keywords
Construction safety management; Safety compliance checking; Visual reasoning; Visual question answering; Cross-modal model; Vision-and-language Transformer; FALLS
DOI
10.1016/j.autcon.2022.104580
Chinese Library Classification number
TU [Building Science]
Discipline classification code
0813
Abstract
Unsafe construction behavior, one of the leading factors of accidents and casualties, can be reduced by strengthening construction inspection. However, current methods use either manual inspection or inefficient cross-modal models based on multiple backbone networks. To alleviate these problems, a "rule-question" transformation and annotation system is formulated, and unsafe behavior detection is turned into a visual reasoning task: visual question answering (VQA). The VQA model is developed based on a vision-and-language Transformer, and unsafe behavior can be identified from the output answers. A dataset containing 16 safety rules and 2386 related construction images is used to fine-tune and validate the VQA model. The results show that the developed VQA model achieves an average recall of 0.81 at a faster reasoning speed. Finally, an applet for safety report generation is implemented to demonstrate the feasibility and practicability of safety compliance checking based on VQA.
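The "rule-question" idea in the abstract can be illustrated with a minimal sketch: each safety rule is paired with a question and the answer that indicates compliance, and a rule is flagged when the model's answer differs. This is not the authors' implementation. The two rules, the question wording, and the names RuleQuestion, check_compliance, and stub_vqa are hypothetical, and a canned stub stands in for the fine-tuned vision-and-language Transformer.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass(frozen=True)
class RuleQuestion:
    """One safety rule rewritten as a question (hypothetical schema)."""
    rule_id: str
    rule: str
    question: str
    compliant_answer: str  # the answer that means the rule is satisfied


# Hypothetical examples; the paper's 16 rules are not listed in the abstract.
RULES: List[RuleQuestion] = [
    RuleQuestion(
        "R1",
        "Workers must wear a hard hat on site.",
        "Is the worker wearing a hard hat?",
        "yes",
    ),
    RuleQuestion(
        "R2",
        "Workers must not stand on the top of a ladder.",
        "Is the worker standing on the top of the ladder?",
        "no",
    ),
]


def check_compliance(
    image: Any,
    rules: List[RuleQuestion],
    vqa_model: Callable[[Any, str], str],
) -> List[str]:
    """Return the IDs of rules whose answer differs from the compliant one."""
    violations = []
    for rq in rules:
        # Normalize the answer so "Yes" and "yes " compare equal.
        answer = vqa_model(image, rq.question).strip().lower()
        if answer != rq.compliant_answer:
            violations.append(rq.rule_id)
    return violations


def stub_vqa(image: Any, question: str) -> str:
    """Stand-in for a real VQA model: returns canned answers (assumption)."""
    canned = {
        "Is the worker wearing a hard hat?": "Yes",
        "Is the worker standing on the top of the ladder?": "yes",
    }
    return canned[question]


if __name__ == "__main__":
    # The stub ignores the image, so None is passed as a placeholder.
    print(check_compliance(None, RULES, stub_vqa))
```

In a real pipeline, stub_vqa would be replaced by a call to a fine-tuned VQA model, and the list of violated rule IDs could feed a safety report of the kind the abstract mentions.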
Pages: 11