EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars

Cited by: 1
Authors
Drobyshev, Nikita [1]
Casademunt, Antoni Bigata [1]
Vougioukas, Konstantinos [1]
Landgraf, Zoe [1]
Petridis, Stavros [1]
Pantic, Maja [1]
Affiliations
[1] Imperial Coll London, London, England
Source
2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2024 | 2024
Keywords
DOI
10.1109/CVPR52733.2024.00812
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Head avatars animated by visual signals have gained popularity, particularly in cross-driving synthesis where the driver differs from the animated character, a challenging but highly practical approach. The recently presented MegaPortraits model has demonstrated state-of-the-art results in this domain. We conduct a deep examination and evaluation of this model, with a particular focus on its latent space for facial expression descriptors, and uncover several limitations with its ability to express intense face motions. To address these limitations, we propose substantial changes in both training pipeline and model architecture, to introduce our EMOPortraits model, where we: Enhance the model's capability to faithfully support intense, asymmetric face expressions, setting a new state-of-the-art result in the emotion transfer task, surpassing previous methods in both metrics and quality. Incorporate speech-driven mode to our model, achieving top-tier performance in audio-driven facial animation, making it possible to drive source identity through diverse modalities, including visual signal, audio, or a blend of both. Furthermore, we propose a novel multi-view video dataset featuring a wide range of intense and asymmetric facial expressions, filling the gap with absence of such data in existing datasets.
Pages: 8498 - 8507
Page count: 10
Related papers
41 in total
  • [21] HIERARCHICAL TEMPORAL MEMORY ENHANCED ONE-SHOT DISTANCE LEARNING FOR ACTION RECOGNITION
    Zou, Yixiong
    Shi, Yemin
    Wang, Yaowei
    Shu, Yu
    Yuan, Qingsheng
    Tian, Yonghong
    2018 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2018,
  • [22] Audio2Head: Audio-driven One-shot Talking-head Generation with Natural Head Motion
    Wang, Suzhen
    Li, Lincheng
    Ding, Yu
    Fan, Changjie
    Yu, Xin
    PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 1098 - 1105
  • [23] An enhanced priority reservation algorithm for ATM multicast switches with a one-shot scheduling scheme
    Kim, HJ
    Sung, DK
    IEICE TRANSACTIONS ON COMMUNICATIONS, 1998, E81B (11) : 2237 - 2241
  • [24] A one-shot domain-independent robust multimedia clustering methodology based on hybrid multimodal fusion
    Xavier Sevillano
    Francesc Alías
    Multimedia Tools and Applications, 2014, 73 : 1507 - 1543
  • [25] SL-DML: Signal Level Deep Metric Learning for Multimodal One-Shot Action Recognition
    Memmesheimer, Raphael
    Theisen, Nick
    Paulus, Dietrich
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 4573 - 4580
  • [26] A one-shot domain-independent robust multimedia clustering methodology based on hybrid multimodal fusion
    Sevillano, Xavier
    Alias, Francesc
    MULTIMEDIA TOOLS AND APPLICATIONS, 2014, 73 (03) : 1507 - 1543
  • [27] One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing
    Wang, Ting-Chun
    Mallya, Arun
    Liu, Ming-Yu
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 10034 - 10044
  • [28] One-shot dual-projection topography enhanced by phase-shifting logical moire
    Jiang, Jiajia
    Guo, Hongwei
    APPLIED OPTICS, 2021, 60 (19) : 5507 - 5516
  • [29] Efficiency-enhanced Progressive Sampling Method on One-shot Person Re-Identification
    Zhao, Jing
    Tang, Yuhua
    Yang, Mingliang
    Huang, Wanrong
    Yang, Qiong
    2020 IEEE INTERNATIONAL CONFERENCE ON REAL-TIME COMPUTING AND ROBOTICS (IEEE-RCAR 2020), 2020, : 375 - 380
  • [30] ETMO-NAS: An efficient two-step multimodal one-shot NAS for lung nodules classification
    Yu, Jiandong
    Li, Tongtong
    Shi, Xuerong
    Zhao, Ziyang
    Chen, Miao
    Zhang, Yu
    Wang, Junyu
    Yao, Zhijun
    Fang, Lei
    Hu, Bin
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2025, 104