Multiscale feature fusion network for monocular complex hand pose estimation

Authors
Zhan, Zhi [1]
Luo, Guang [2]
Affiliations
[1] Guangdong Engn Polytech, Zhan Zhi, Peoples R China
[2] South China Normal Univ, Luo Guang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
feature extraction; learning (artificial intelligence); pose estimation;
DOI
10.1049/ell2.13044
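Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract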
Hand pose estimation from a single RGB image suffers from low accuracy because of pose complexity, the local self-similarity of finger features, and occlusion. To address this problem, a multiscale feature fusion network (MS-FF) for monocular hand pose estimation is proposed. A channel conversion module adjusts the channel weights so that the network can exploit the information in different channels and enhance important gesture information. The network extracts features simultaneously from feature maps of different resolutions, capturing both fine detail, such as occluded edges and fingertips, and deep semantic information. A global regression module then fuses the feature maps of different resolutions to produce the hand pose, and an optimization procedure corrects joints that were not regressed to the correct position. The MS-FF is trained on the InterHand2.6M dataset and the Rendered Handpose Dataset (RHD). Compared with other methods, it obtains the smallest average hand joint error, which verifies its effectiveness and indicates higher accuracy and robustness.
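The abstract does not specify how the channel conversion module, the multi-resolution fusion, or the average joint error are computed. The sketch below is one plausible reading, not the authors' implementation. It assumes a squeeze-and-excitation-style sigmoid gate for channel weighting, nearest-neighbour upsampling followed by averaging for fusion, and mean Euclidean distance for the joint error. All function names, shapes, and weights are hypothetical.

```python
import numpy as np


def channel_reweight(feat, w, b):
    """Scale each channel of feat (C, H, W) by a gate in (0, 1).

    Assumption: the gate is a sigmoid of a linear map applied to the
    per-channel global average (squeeze-and-excitation style).
    """
    squeezed = feat.mean(axis=(1, 2))                 # (C,)
    gate = 1.0 / (1.0 + np.exp(-(w @ squeezed + b)))  # (C,)
    return feat * gate[:, None, None]


def upsample_nearest(feat, factor):
    """Nearest-neighbour upsampling of (C, H, W) by an integer factor."""
    return feat.repeat(factor, axis=1).repeat(factor, axis=2)


def fuse_multiscale(feats):
    """Bring every map to the resolution of feats[0], then average them.

    Assumption: resolutions differ by integer factors and fusion is a
    plain mean; the paper may use a learned fusion instead.
    """
    target_h = feats[0].shape[1]
    resized = [upsample_nearest(f, target_h // f.shape[1]) for f in feats]
    return np.mean(resized, axis=0)


def mean_joint_error(pred, gt):
    """Average Euclidean distance between predicted and true joints (J, D)."""
    return float(np.linalg.norm(pred - gt, axis=1).mean())


# Toy inputs: random feature maps at two resolutions with 4 channels.
rng = np.random.default_rng(0)
C = 4
high = rng.standard_normal((C, 8, 8))
low = rng.standard_normal((C, 4, 4))
w, b = rng.standard_normal((C, C)), np.zeros(C)

fused = fuse_multiscale([channel_reweight(high, w, b),
                         channel_reweight(low, w, b)])
print(fused.shape)  # (4, 8, 8)
```

In a trained network the gate weights would be learned and a regression head would map the fused features to joint coordinates; here they are random placeholders that only illustrate the data flow.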
Pages: 4