SADNet: Generating immersive virtual reality avatars by real-time monocular pose estimation

Cited: 0
Authors
Jiang, Ling [1 ]
Xiong, Yuan [1 ]
Wang, Qianqian [1 ]
Chen, Tong [1 ]
Wu, Wei [1 ]
Zhou, Zhong [1 ,2 ,3 ]
Affiliations
[1] Beihang Univ, State Key Lab Virtual Real Technol & Syst, Beijing, Peoples R China
[2] Zhongguancun Lab, Beijing, Peoples R China
[3] Beihang Univ, POB 6863,37 Xueyuan Rd, Beijing, Peoples R China
Keywords
3D avatar; computer animation; human pose estimation;
DOI
10.1002/cav.2233
CLC Classification
TP31 [Computer Software];
Discipline Codes
081202; 0835;
Abstract
Generating immersive virtual reality avatars is a challenging task in VR/AR applications: physical human body poses must be mapped to avatars in virtual scenes for an immersive user experience. However, most existing work is time-consuming and limited by datasets, and thus fails to satisfy the immersive and real-time requirements of VR systems. In this paper, we generate 3D real-time virtual reality avatars from a monocular camera to address these problems. Specifically, we first design a self-attention distillation network (SADNet) for effective human pose estimation, guided by a pre-trained teacher. Secondly, we propose a lightweight pose mapping method for human avatars that uses the camera model to map 2D poses to 3D avatar keypoints, generating real-time human avatars with pose consistency. Finally, we integrate our framework into a VR system, displaying the generated 3D pose-driven avatars on helmet-mounted display devices for an immersive user experience. We evaluate SADNet on two publicly available datasets. Experimental results show that SADNet achieves a state-of-the-art trade-off between speed and accuracy. In addition, we conducted a user experience study on the performance and immersion of the virtual reality avatars. Results show that the pose-driven 3D human avatars generated by our method are smooth and attractive.
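The abstract's camera-model pose mapping can be illustrated with a minimal sketch. Under a standard pinhole model, a 2D keypoint at pixel (u, v) with an assumed depth Z back-projects to camera-space coordinates X = (u - cx) * Z / fx and Y = (v - cy) * Z / fy. This is not the paper's actual implementation; the function name, parameters, and the idea of supplying per-joint depths are illustrative assumptions.

```python
import numpy as np

def backproject_keypoints(kpts_2d, depths, fx, fy, cx, cy):
    """Back-project 2D image keypoints to 3D camera-space points
    using a pinhole camera model.

    kpts_2d : (N, 2) array of pixel coordinates (u, v)
    depths  : (N,) array of assumed per-keypoint depths (metres)
    fx, fy  : focal lengths in pixels; cx, cy : principal point
    Returns an (N, 3) array of 3D points (X, Y, Z).
    """
    kpts_2d = np.asarray(kpts_2d, dtype=float)
    depths = np.asarray(depths, dtype=float)
    x = (kpts_2d[:, 0] - cx) * depths / fx
    y = (kpts_2d[:, 1] - cy) * depths / fy
    return np.stack([x, y, depths], axis=1)
```

In practice the per-joint depths would come from some additional cue (for example, a reference skeleton with known bone lengths); the 3D points can then be retargeted onto the avatar's rig.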
Pages: 15
Related Papers
50 items in total (items [31]-[40] shown)
  • [31] Accurate, Robust, and Real-Time Pose Estimation of Finger
    Yun, Youngmok
    Agarwal, Priyanshu
    Deshpande, Ashish D.
    JOURNAL OF DYNAMIC SYSTEMS MEASUREMENT AND CONTROL-TRANSACTIONS OF THE ASME, 2015, 137 (03):
  • [32] Real-time camera pose estimation for sports fields
    Citraro, Leonardo
    Marquez-Neila, Pablo
    Savare, Stefano
    Jayaram, Vivek
    Dubout, Charles
    Renaut, Felix
    Hasfura, Andres
    Ben Shitrit, Horesh
    Fua, Pascal
    MACHINE VISION AND APPLICATIONS, 2020, 31 (03)
  • [33] Real-Time Head Pose Estimation on Mobile Devices
    Cheng, Zhengxin
    Bai, Fangyu
    COMPUTER VISION - ACCV 2016 WORKSHOPS, PT I, 2017, 10116 : 599 - 609
  • [34] Real-time camera pose and focal length estimation
    Jain, Sumit
    Neumann, Ulrich
    18TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 1, PROCEEDINGS, 2006, : 551 - +
  • [35] Real-Time Pose Estimation Piggybacked on Object Detection
    Juranek, Roman
    Herout, Adam
    Dubska, Marketa
    Zemcik, Pavel
    2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, : 2381 - 2389
  • [36] Tensorpose: Real-time pose estimation for interactive applications
    Schirmer Silva, Luiz Jose
    Soares da Silva, Djalma Lucio
    Raposo, Alberto Barbosa
    Velho, Luiz
    Vieira Lopes, Helio Cortes
    COMPUTERS & GRAPHICS-UK, 2019, 85 : 1 - 14
  • [37] AI-empowered Pose Reconstruction for Real-time Synthesis of Remote Metaverse Avatars
    Gu, Xingci
    Yuan, Ye
    Yang, Jianjun
    Li, Longjiang
    2024 21ST INTERNATIONAL JOINT CONFERENCE ON COMPUTER SCIENCE AND SOFTWARE ENGINEERING, JCSSE 2024, 2024, : 86 - 93
  • [38] Real-time Kinematic Doppler Pose Estimation for IMES
    Sakamoto, Yoshihiro
    Ebinuma, Takuji
    Fujii, Kenjiro
    Sugano, Shigeki
    2013 IEEE/ASME INTERNATIONAL CONFERENCE ON ADVANCED INTELLIGENT MECHATRONICS (AIM): MECHATRONICS FOR HUMAN WELLBEING, 2013, : 1300 - 1305
  • [39] Robust Real-Time Extreme Head Pose Estimation
    Tulyakov, Sergey
    Vieriu, Radu-Laurentiu
    Semeniuta, Stanislau
    Sebe, Nicu
    2014 22ND INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2014, : 2263 - 2268
  • [40] Poor Man's Virtual Camera: Real-Time Simultaneous Matting and Camera Pose Estimation
    Szentandrasi, Istvan
    Dubska, Marketa
    Zacharias, Michal
    Herout, Adam
    IEEE COMPUTER GRAPHICS AND APPLICATIONS, 2019, 39 (06) : 108 - 119