A hybrid network for estimating 3D interacting hand pose from a single RGB image

Cited by: 0
Authors
Bao, Wenxia [1]
Gao, Qiuyue [1]
Yang, Xianjun [2]
Affiliations
[1] Anhui Univ, Sch Elect & Informat Engn, Hefei 230601, Anhui, Peoples R China
[2] Chinese Acad Sci, Hefei Inst Phys Sci, Hefei 230031, Anhui, Peoples R China
Keywords
3D hand pose estimation; Interacting Hand; Hybrid network; End to end network
DOI
10.1007/s11760-024-03043-1
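Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract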
The estimation of 3D interacting hand pose from a single RGB image is a challenging problem. The hands tend to occlude each other and are self-similar in two-handed interactions. In this study, a simple, accurate end-to-end framework called HybridPoseNet is proposed for estimating 3D interactive hand pose. The hybrid network employs an encoder-decoder architecture. More specifically, the feature encoder is a hybrid structure that combines a convolutional neural network (CNN) with a transformer to accomplish the feature encoding of hand information. An ordinary CNN is employed to extract the local detailed features of a given image, and a vision transformer is used to capture the long-distance spatial interactions between the cross-positional feature vectors. Moreover, the 3D pose decoder is based on left and right network branches, which are fused via a feature enhancement module (FEM). The FEM helps reduce the ambiguity in appearance caused by the self-similarity of the hands. The decoder elevates the 2D pose to the 3D pose by estimating two depth components. The ablation experiments demonstrate the effectiveness of each module in the network. In addition, comprehensive experiments on the InterHand2.6M dataset show that the proposed method outperforms previous state-of-the-art methods for estimating interactive hand pose.
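The lifting step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say what the two depth components are, so this example assumes a common 2.5D formulation: a per-joint depth relative to a root joint, plus an absolute root depth, back-projected through a pinhole camera. The function name `lift_to_3d` and all parameters are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]


def lift_to_3d(
    joints_2d: List[Point2D],
    rel_depths: List[float],
    root_depth: float,
    focal: Tuple[float, float],
    principal: Tuple[float, float],
) -> List[Point3D]:
    """Back-project 2D joints (pixels) into camera-space 3D points.

    Assumption (not from the paper): each joint's absolute depth is the
    root depth plus that joint's root-relative depth.
    """
    fx, fy = focal
    cx, cy = principal
    joints_3d = []
    for (u, v), dz in zip(joints_2d, rel_depths):
        z = root_depth + dz        # absolute depth of this joint
        x = (u - cx) * z / fx      # pinhole back-projection, x axis
        y = (v - cy) * z / fy      # pinhole back-projection, y axis
        joints_3d.append((x, y, z))
    return joints_3d


# Two joints: the root at the image centre, and one joint 100 px to the
# right that lies 50 units farther from the camera than the root.
example = lift_to_3d(
    joints_2d=[(320.0, 240.0), (420.0, 240.0)],
    rel_depths=[0.0, 50.0],
    root_depth=500.0,
    focal=(1000.0, 1000.0),
    principal=(320.0, 240.0),
)
```

With these made-up inputs, the root maps to the optical axis at depth 500, and the second joint maps to x = 55 at depth 550. An actual system would predict the 2D coordinates and both depth terms with the network rather than take them as given.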
Pages: 3801-3814
Number of pages: 14
Related papers
50 in total
  • [41] HMTNet: 3D Hand Pose Estimation From Single Depth Image Based on Hand Morphological Topology
    Zhou, Weiguo
    Jiang, Xin
    Chen, Chen
    Mei, Sijia
    Liu, Yun-Hui
    IEEE SENSORS JOURNAL, 2020, 20 (11) : 6004 - 6011
  • [42] Structure-Aware 3D Hand Pose Regression from a Single Depth Image
    Malik, Jameel
    Elhayek, Ahmed
    Stricker, Didier
    VIRTUAL REALITY AND AUGMENTED REALITY, EUROVR 2018, 2018, 11162 : 3 - 17
  • [43] SAR: Spatial-Aware Regression for 3D Hand Pose and Mesh Reconstruction from a Monocular RGB Image
    Zheng, Xiaozheng
    Ren, Pengfei
    Sun, Haifeng
    Wang, Jingyu
    Qi, Qi
    Liao, Jianxin
    2021 IEEE INTERNATIONAL SYMPOSIUM ON MIXED AND AUGMENTED REALITY (ISMAR 2021), 2021, : 99 - 108
  • [44] Reweighted sparse representation with residual compensation for 3D human pose estimation from a single RGB image
    Jiang, Mengxi
    Yu, Zhuliang
    Zhang, Yan
    Wang, Qicong
    Li, Cuihua
    Lei, Yunqi
    NEUROCOMPUTING, 2019, 358 : 332 - 343
  • [45] Keypoint Fusion for RGB-D Based 3D Hand Pose Estimation
    Liu, Xingyu
    Ren, Pengfei
    Gao, Yuanyuan
    Wang, Jingyu
    Sun, Haifeng
    Qi, Qi
    Zhuang, Zirui
    Liao, Jianxin
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 4, 2024, : 3756 - 3764
  • [46] Weakly-Supervised 3D Hand Pose Estimation from Monocular RGB Images
    Cai, Yujun
    Ge, Liuhao
    Cai, Jianfei
    Yuan, Junsong
    COMPUTER VISION - ECCV 2018, PT VI, 2018, 11210 : 678 - 694
  • [47] 3D Hand Pose Estimation from RGB Using Privileged Learning with Depth Data
    Yuan, Shanxin
    Stenger, Bjorn
    Kim, Tae-Kyun
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 2866 - 2873
  • [48] 3D hand reconstruction with both shape and appearance from an RGB image
    Chang, Xiaoyun
    Yi, Wentao
    Lin, Xiangbo
    Sun, Yi
    IMAGE AND VISION COMPUTING, 2023, 135
  • [49] Model-Based 3D Pose Estimation of a Single RGB Image Using a Deep Viewpoint Classification Neural Network
    Su, Jui-Yuan
    Cheng, Shyi-Chyi
    Chang, Chin-Chun
    Chen, Jing-Ming
    APPLIED SCIENCES-BASEL, 2019, 9 (12):
  • [50] 3D Interacting Hand Pose Estimation by Hand De-occlusion and Removal
    Meng, Hao
    Jin, Sheng
    Liu, Wentao
    Qian, Chen
    Lin, Mengxiang
    Ouyang, Wanli
    Luo, Ping
    COMPUTER VISION - ECCV 2022, PT VI, 2022, 13666 : 380 - 397