CWPR: An optimized transformer-based model for construction worker pose estimation on construction robots

Cited by: 0
Authors
Zhou, Jiakai [1]
Zhou, Wanlin [1]
Wang, Yang [2, 3]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Mech & Elect Engn, Nanjing 210000, Peoples R China
[2] Anhui Univ Technol, Sch Mech Engn, Maanshan 243000, Peoples R China
[3] Anhui Prov Key Lab Special Heavy Load Robot, Maanshan 243000, Peoples R China
Keywords
Construction worker pose; Construction robots; Transformer; Multi-human pose estimation; Surveillance videos; Recognition
DOI
10.1016/j.aei.2024.102894
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Estimating construction workers' poses is critically important for recognizing unsafe behaviors, conducting ergonomic analyses, and assessing productivity. Recently, using construction robots to capture RGB images for pose estimation has offered flexible monitoring perspectives and timely interventions. However, existing multi-human pose estimation (MHPE) methods struggle to balance accuracy and speed, making them unsuitable for real-time applications on construction robots. This paper introduces the Construction Worker Pose Recognizer (CWPR), an optimized Transformer-based MHPE model tailored for construction robots. Specifically, CWPR uses a lightweight encoder equipped with a multi-scale feature fusion module to enhance operational speed. An Intersection over Union (IoU)-aware query selection strategy then provides high-quality initial queries for the hybrid decoder, significantly improving performance. In addition, a decoder denoising module incorporates noisy ground truth into the decoder, mitigating sample imbalance and further improving accuracy. The Construction Worker Pose and Action (CWPA) dataset is also collected from 154 videos captured in real construction scenarios. The dataset is annotated for two tasks: a pose benchmark for MHPE and an action benchmark for action recognition. Experiments demonstrate that CWPR achieves top-level accuracy and the fastest inference speed, attaining 68.1 Average Precision (AP) with a processing time of 26 ms on the COCO test set and 76.2 AP with 21 ms on the CWPA pose benchmark. Moreover, when integrated with the action recognition method ST-GCN on construction robot hardware, CWPR achieves 78.7 AP and a processing time of 19 ms on the CWPA action benchmark, validating its effectiveness for practical deployment.
Pages: 12