RGB2Hands: Real-Time Tracking of 3D Hand Interactions from Monocular RGB Video

Cited by: 65
Authors
Wang, Jiayi [1 ]
Mueller, Franziska [1 ]
Bernard, Florian [2 ]
Sorli, Suzanne [3 ]
Sotnychenko, Oleksandr [1 ]
Qian, Neng [1 ]
Otaduy, Miguel A. [3 ]
Casas, Dan [3 ]
Theobalt, Christian [1 ]
Affiliations
[1] MPI Informat, Saarland Informat Campus, Saarbrucken, Germany
[2] Tech Univ Munich, MPI Informat, Saarland Informat Campus, Munich, Germany
[3] Univ Rey Juan Carlos, Mostoles, Spain
Source
ACM TRANSACTIONS ON GRAPHICS | 2020 / Vol. 39 / No. 06
Funding
European Research Council;
Keywords
hand tracking; hand pose estimation; hand reconstruction; two hands; monocular RGB; RGB video; computer vision;
DOI
10.1145/3414685.3417852
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Tracking and reconstructing the 3D pose and geometry of two hands in interaction is a challenging problem that has a high relevance for several human-computer interaction applications, including AR/VR, robotics, or sign language recognition. Existing works are either limited to simpler tracking settings (e.g., considering only a single hand or two spatially separated hands), or rely on less ubiquitous sensors, such as depth cameras. In contrast, in this work we present the first real-time method for motion capture of skeletal pose and 3D surface geometry of hands from a single RGB camera that explicitly considers close interactions. In order to address the inherent depth ambiguities in RGB data, we propose a novel multi-task CNN that regresses multiple complementary pieces of information, including segmentation, dense matchings to a 3D hand model, and 2D keypoint positions, together with newly proposed intra-hand relative depth and inter-hand distance maps. These predictions are subsequently used in a generative model fitting framework in order to estimate pose and shape parameters of a 3D hand model for both hands. We experimentally verify the individual components of our RGB two-hand tracking and 3D reconstruction pipeline through an extensive ablation study. Moreover, we demonstrate that our approach offers previously unseen two-hand tracking performance from RGB, and quantitatively and qualitatively outperforms existing RGB-based methods that were not explicitly designed for two-hand interactions. Moreover, our method even performs on-par with depth-based real-time methods.
Pages: 16
Related papers
50 in total
  • [1] GANerated Hands for Real-Time 3D Hand Tracking from Monocular RGB
    Mueller, Franziska
    Bernard, Florian
    Sotnychenko, Oleksandr
    Mehta, Dushyant
    Sridhar, Srinath
    Casas, Dan
    Theobalt, Christian
    [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018 : 49 - 59
  • [2] Real-time 3D human pose and motion reconstruction from monocular RGB videos
    Yiannakides, Anastasios
    Aristidou, Andreas
    Chrysanthou, Yiorgos
    [J]. COMPUTER ANIMATION AND VIRTUAL WORLDS, 2019, 30 (3-4)
  • [3] Real-Time Facial Expression Transformation for Monocular RGB Video
    Ma, L.
    Deng, Z.
    [J]. COMPUTER GRAPHICS FORUM, 2019, 38 (01) : 470 - 481
  • [4] 3D hand mesh reconstruction from a monocular RGB image
    Hao Peng
    Chuhua Xian
    Yunbo Zhang
    [J]. The Visual Computer, 2020, 36 : 2227 - 2239
  • [5] 3D hand mesh reconstruction from a monocular RGB image
    Peng, Hao
    Xian, Chuhua
    Zhang, Yunbo
    [J]. VISUAL COMPUTER, 2020, 36 (10-12) : 2227 - 2239
  • [6] Real-time tracking of 3D elastic objects with an RGB-D sensor
    Petit, Antoine
    Lippiello, Vincenzo
    Siciliano, Bruno
    [J]. 2015 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2015 : 3914 - 3921
  • [7] Real-time Hand Tracking under Occlusion from an Egocentric RGB-D Sensor
    Mueller, Franziska
    Mehta, Dushyant
    Sotnychenko, Oleksandr
    Sridhar, Srinath
    Casas, Dan
    Theobalt, Christian
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017 : 1163 - 1172
  • [8] Real-Time Joint Tracking of a Hand Manipulating an Object from RGB-D Input
    Sridhar, Srinath
    Mueller, Franziska
    Zollhoefer, Michael
    Casas, Dan
    Oulasvirta, Antti
    Theobalt, Christian
    [J]. COMPUTER VISION - ECCV 2016, PT II, 2016, 9906 : 294 - 310
  • [9] Real-time Hand Tracking under Occlusion from an Egocentric RGB-D Sensor
    Mueller, Franziska
    Mehta, Dushyant
    Sotnychenko, Oleksandr
    Sridhar, Srinath
    Casas, Dan
    Theobalt, Christian
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2017), 2017 : 1284 - 1293
  • [10] Automatic Video Segmentation and Object Tracking with Real-Time RGB-D Data
    Chen, I-Kuei
    Hsu, Szu-Lu
    Chi, Chung-Yu
    Chen, Liang-Gee
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS (ICCE), 2014 : 488 - 489