DIR-BHRNet: A Lightweight Network for Real-Time Vision-Based Multiperson Pose Estimation on Smartphones

Times cited: 0
Authors
Lan, Gongjin [1]
Wu, Yu [1]
Hao, Qi [1, 2]
Affiliations
[1] Southern Univ Sci & Technol, Dept Comp Sci & Engn, Shenzhen 518055, Peoples R China
[2] Southern Univ Sci & Technol, Res Inst Trustworthy Autonomous Syst, Shenzhen 518055, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Deep learning; human pose estimation (HPE); multiperson pose estimation (MPPE); real time; smartphones;
DOI
10.1109/TII.2024.3421511
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Human pose estimation (HPE), particularly multiperson pose estimation (MPPE), has been applied in many domains, such as human-machine systems. However, current MPPE methods generally run on powerful GPU systems and incur high computational costs, so real-time MPPE on mobile devices with limited computing power remains a challenging task. In this article, we propose a lightweight neural network, DIR-BHRNet, for real-time MPPE on smartphones. DIR-BHRNet combines two designs. The first is a novel lightweight convolutional module, dense inverted residual (DIR), which improves accuracy by adding a depthwise convolution and a shortcut connection to the well-known inverted residual. The second is a novel efficient network structure, balanced HRNet (BHRNet), which reduces computational costs by reconfiguring the number of convolutional blocks on each branch. We evaluate DIR-BHRNet on the well-known COCO and CrowdPose datasets. The results show that DIR-BHRNet outperforms state-of-the-art methods in accuracy at a real-time computational cost. Finally, we implement DIR-BHRNet on current mainstream Android smartphones, where it runs at more than 10 FPS. The free-to-use executable file (Android 10), source code, and a video description of this work are publicly available online to facilitate the development of real-time MPPE on smartphones.
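To give a rough sense of why an extra depthwise convolution is cheap, the sketch below counts the weights in a standard inverted residual block (1x1 expansion, 3x3 depthwise, 1x1 projection) and in a hypothetical variant with one additional depthwise convolution. The abstract does not specify the DIR layout, so the placement of the extra depthwise layer on the block input, the expansion factor of 6, the channel width of 32, and all function names here are assumptions for illustration only. Biases and normalization layers are ignored, and a shortcut connection adds no weights.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a k x k convolution, ignoring bias and normalization."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (c_in // groups) * c_out * k * k


def inverted_residual_params(c_in, c_out, expand=6, k=3):
    """1x1 expansion -> k x k depthwise -> 1x1 projection."""
    hidden = c_in * expand
    expansion = conv_params(c_in, hidden, 1)
    depthwise = conv_params(hidden, hidden, k, groups=hidden)
    projection = conv_params(hidden, c_out, 1)
    return expansion + depthwise + projection


def with_extra_depthwise_params(c_in, c_out, expand=6, k=3):
    """Hypothetical variant: one more k x k depthwise conv on the block input.

    ASSUMPTION: the position of the extra layer is illustrative; it is not
    taken from the paper. The shortcut connection contributes no weights.
    """
    extra = conv_params(c_in, c_in, k, groups=c_in)
    return inverted_residual_params(c_in, c_out, expand, k) + extra


if __name__ == "__main__":
    base = inverted_residual_params(32, 32)
    variant = with_extra_depthwise_params(32, 32)
    print("inverted residual weights:", base)
    print("with extra depthwise conv:", variant)
    print("relative increase: {:.2%}".format((variant - base) / base))
```

Under these assumptions the extra depthwise layer adds only a few percent to the block's weight count, which is consistent with the abstract's framing of DIR as a lightweight module; actual figures depend on the real layer configuration.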
Pages: 12533-12541
Number of pages: 9
Related papers
50 in total
  • [1] Real-Time Vision-Based Chinese Sign Language Recognition with Pose Estimation and Attention Network
    Cheng, Sirui
    Huang, Chaorui
    Wang, Zhaohui
    Wang, Jiaxing
    Zeng, Zhen
    Wang, Fei
    Ding, Qichuan
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND BIOMIMETICS (IEEE-ROBIO 2021), 2021, : 1210 - 1215
  • [2] A Robust Vision-Based Sensor Fusion Approach for Real-Time Pose Estimation
    Assa, Akbar
    Janabi-Sharifi, Farrokh
    [J]. IEEE TRANSACTIONS ON CYBERNETICS, 2014, 44 (02) : 217 - 227
  • [3] Low Cost Vision-Based Real-Time Lane Recognition and Lateral Pose Estimation
    Tan, Sofyan
    Agnes
    Mae, Johannes
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND CYBERNETICS (CYBERNETICSCOM), 2013, : 151 - 154
  • [4] Vision based real-time pose estimation for intelligent vehicles
    Yang, M
    Yu, Q
    Wang, H
    Zhang, B
    [J]. 2004 IEEE INTELLIGENT VEHICLES SYMPOSIUM, 2004, : 262 - 267
  • [5] Lightweight Deep Neural Network-based Real-Time Pose Estimation on Embedded Systems
    Heo, Junho
    Kim, Ginam
    Park, Jaeseo
    Kim, Yeonsu
    Cho, Sung-Sik
    Lee, Chang Won
    Kang, Suk-Ju
    [J]. 2020 IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV), 2020, : 1066 - 1071
  • [6] An intelligent bulletin board system with real-time vision-based interaction using head pose estimation
    Chang, Cheng-Yu
    Chung, Pau-Choo
    Yeh, Yu-Sheng
    Yang, Jar-Ferr
    [J]. 18TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 1, PROCEEDINGS, 2006, : 1140 - +
  • [7] Real-time data fusion on stabilizing camera pose estimation output for vision-based road navigation
    Hu, ZC
    Uchimura, K
    [J]. STEREOSCOPIC DISPLAYS AND VIRTUAL REALITY SYSTEMS XI, 2004, 5291 : 480 - 490
  • [8] Vision-based Real-time Estimation of Smartphone Heading and Misalignment
    Kazemipur, Bashir
    Syed, Zainab
    Georgy, Jacques
    El-Sheimy, Naser
    [J]. PROCEEDINGS OF THE 26TH INTERNATIONAL TECHNICAL MEETING OF THE SATELLITE DIVISION OF THE INSTITUTE OF NAVIGATION (ION GNSS 2013), 2013, : 505 - 510
  • [9] Vision-based SLAM in real-time
    Davison, Andrew J.
    [J]. Pattern Recognition and Image Analysis, Pt 1, Proceedings, 2007, 4477 : 9 - 12
  • [10] Real-time 3D Skeletonisation in Computer Vision-Based Human Pose Estimation Using GPGPU
    Bakken, Rune Havnung
    Eliassen, Lars Moland
    [J]. 2012 3RD INTERNATIONAL CONFERENCE ON IMAGE PROCESSING THEORY, TOOLS AND APPLICATIONS, 2012, : 61 - 67