EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars

Cited by: 1
Authors
Drobyshev, Nikita [1]
Casademunt, Antoni Bigata [1]
Vougioukas, Konstantinos [1]
Landgraf, Zoe [1]
Petridis, Stavros [1]
Pantic, Maja [1]
Affiliations
[1] Imperial College London, London, England
Source
2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024) | 2024
DOI
10.1109/CVPR52733.2024.00812
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Head avatars animated by visual signals have gained popularity, particularly in cross-driving synthesis, where the driver differs from the animated character, a challenging but highly practical approach. The recently presented MegaPortraits model has demonstrated state-of-the-art results in this domain. We conduct a deep examination and evaluation of this model, with a particular focus on its latent space for facial expression descriptors, and uncover several limitations in its ability to express intense face motions. To address these limitations, we propose substantial changes in both the training pipeline and the model architecture, introducing our EMOPortraits model, where we:
  • Enhance the model's capability to faithfully support intense, asymmetric facial expressions, setting a new state-of-the-art result in the emotion transfer task and surpassing previous methods in both metrics and quality.
  • Incorporate a speech-driven mode into our model, achieving top-tier performance in audio-driven facial animation and making it possible to drive the source identity through diverse modalities, including a visual signal, audio, or a blend of both.
Furthermore, we propose a novel multi-view video dataset featuring a wide range of intense and asymmetric facial expressions, filling the gap left by the absence of such data in existing datasets.
Pages: 8498-8507
Number of pages: 10