SADNet: Generating immersive virtual reality avatars by real-time monocular pose estimation

Cited: 0
Authors
Jiang, Ling [1 ]
Xiong, Yuan [1 ]
Wang, Qianqian [1 ]
Chen, Tong [1 ]
Wu, Wei [1 ]
Zhou, Zhong [1 ,2 ,3 ]
Affiliations
[1] Beihang Univ, State Key Lab Virtual Real Technol & Syst, Beijing, Peoples R China
[2] Zhongguancun Lab, Beijing, Peoples R China
[3] Beihang Univ, POB 6863,37 Xueyuan Rd, Beijing, Peoples R China
Keywords
3D avatar; computer animation; human pose estimation
DOI
10.1002/cav.2233
CLC number
TP31 [Computer software]
Subject classification codes
081202; 0835
Abstract
Generating immersive virtual reality avatars, which maps physical human body poses to avatars in virtual scenes for an immersive user experience, is a challenging task in VR/AR applications. However, most existing work is time-consuming and limited by datasets, and therefore does not satisfy the immersive and real-time requirements of VR systems. In this paper, we aim to generate 3D real-time virtual reality avatars from a monocular camera to address these problems. Specifically, we first design a self-attention distillation network (SADNet) for effective human pose estimation, guided by a pre-trained teacher. Secondly, we propose a lightweight pose mapping method for human avatars that utilizes the camera model to map 2D poses to 3D avatar keypoints, generating real-time human avatars with pose consistency. Finally, we integrate our framework into a VR system, displaying the generated 3D pose-driven avatars on helmet-mounted display devices for an immersive user experience. We evaluate SADNet on two publicly available datasets. Experimental results show that SADNet achieves a state-of-the-art trade-off between speed and accuracy. In addition, we conducted a user experience study on the performance and immersion of the virtual reality avatars. The results show that the pose-driven 3D human avatars generated by our method are smooth and attractive.
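The pose mapping step described in the abstract lifts 2D image keypoints to 3D avatar keypoints through the camera model, but the abstract gives no implementation details. The following is only a minimal sketch of that general idea under standard pinhole-camera assumptions; the function name backproject_keypoints, the per-joint depth input, and the intrinsics values are illustrative assumptions and not the paper's actual method.

import numpy as np

def backproject_keypoints(kp_2d, depths, fx, fy, cx, cy):
    """Lift 2D pixel keypoints to 3D camera-space points with a pinhole model.

    kp_2d  : (N, 2) array of pixel coordinates (u, v)
    depths : (N,) per-joint depths z, e.g. from a reference skeleton (assumed input)
    fx, fy : focal lengths in pixels; cx, cy : principal point
    Returns an (N, 3) array of camera-space coordinates usable as avatar keypoints.
    """
    u, v = kp_2d[:, 0], kp_2d[:, 1]
    z = np.asarray(depths, dtype=float)
    x = (u - cx) * z / fx  # inverts the projection u = fx * x / z + cx
    y = (v - cy) * z / fy  # inverts the projection v = fy * y / z + cy
    return np.stack([x, y, z], axis=-1)

# Usage: 17 hypothetical joints detected in a 640x480 frame, all assumed roughly 2.5 m away.
kp = np.random.rand(17, 2) * np.array([640.0, 480.0])
avatar_keypoints = backproject_keypoints(kp, np.full(17, 2.5), fx=600.0, fy=600.0, cx=320.0, cy=240.0)
print(avatar_keypoints.shape)  # (17, 3)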
Pages: 15
Related papers
50 items in total
  • [21] A Real-Time Hand Pose Estimation System with Retrieval
    Hou, Guangdong
    Cui, Runpeng
    Zhang, Changshui
    2015 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC 2015): BIG DATA ANALYTICS FOR HUMAN-CENTRIC SYSTEMS, 2015, : 1738 - 1744
  • [22] Real-time camera pose estimation for sports fields
    Citraro, Leonardo
    Márquez-Neila, Pablo
    Savarè, Stefano
    Jayaram, Vivek
    Dubout, Charles
    Renaut, Félix
    Hasfura, Andrés
    Ben Shitrit, Horesh
    Fua, Pascal
    MACHINE VISION AND APPLICATIONS, 2020, 31
  • [23] Accurate, robust, and real-time pose estimation of finger
    Department of Mechanical Engineering, University of Texas at Austin, Austin, TX 78712, United States
    J Dyn Syst Meas Control Trans ASME, 3
  • [24] Real-Time Pose Estimation Using Constrained Dynamics
    Bakken, Rune Havnung
    Hilton, Adrian
    ARTICULATED MOTION AND DEFORMABLE OBJECTS, 2012, 7378 : 37 - 46
  • [26] Real-Time Hand Pose Estimation Using Classifiers
    Polrola, Mateusz
    Wojciechowski, Adam
    COMPUTER VISION AND GRAPHICS, 2012, 7594 : 573 - 580
  • [27] Real-Time Articulated Hand Detection and Pose Estimation
    Panin, Giorgio
    Klose, Sebastian
    Knoll, Alois
    ADVANCES IN VISUAL COMPUTING, PT 2, PROCEEDINGS, 2009, 5876 : 1131 - 1140
  • [29] Real-Time Reinforcement Learning for Optimal Viewpoint Selection in Monocular 3D Human Pose Estimation
    Lee, Sanghyeon
    Hwang, Yoonho
    Lee, Jong Taek
    IEEE Access, 2024, 12 : 191020 - 191029
  • [30] Real-Time Face Pose Estimation in Challenging Environments
    Hazar, Mliki
    Mohamed, Hammami
    Hanene, Ben-Abdallah
    ADVANCED CONCEPTS FOR INTELLIGENT VISION SYSTEMS, ACIVS 2013, 2013, 8192 : 114 - 125