Leveraging the Latent Diffusion Models for Offline Facial Multiple Appropriate Reactions Generation

Cited by: 4
Authors
Yu, Jun [1]
Zhao, Ji [1]
Xie, Guochen [1]
Chen, Fengxin [2]
Yu, Ye [2]
Peng, Liang [3]
Li, Minglei [3]
Dai, Zonghong [3]
Affiliations
[1] Univ Sci & Technol China, Hefei, Peoples R China
[2] Hefei Univ Technol, Hefei, Peoples R China
[3] Huawei Cloud, Shenzhen, Peoples R China
Keywords
Listener Reaction; Offline Reaction Generation; Diffusion Model
DOI
10.1145/3581783.3612865
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Offline Multiple Appropriate Facial Reaction Generation (OMAFRG) aims to predict the reactions of different listeners to a given speaker, which is useful in scenarios such as human-computer interaction and social media analysis. In recent years, the Offline Facial Reaction Generation (OFRG) task has been explored in different ways. However, most studies focus only on deterministic listener reactions. Research on the non-deterministic setting (i.e., OMAFRG) has received little attention, and the results are far from satisfactory. Compared with deterministic OFRG, the OMAFRG task is closer to real-world conditions but is more difficult because it requires modeling both stochasticity and context. In this paper, we propose a new model named FRDiff to tackle this issue. Our model builds on the diffusion model architecture, with modifications that enhance its ability to aggregate context features. The inherent stochasticity of diffusion models enables our model to generate multiple reactions. We conduct experiments on the datasets provided by the ACM Multimedia REACT2023 challenge and obtain second place on the leaderboard, which demonstrates the effectiveness of our method.
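The abstract's point that sampling stochasticity yields multiple reactions for one speaker can be illustrated with a toy sketch. The code below is not FRDiff: it runs standard DDPM-style ancestral sampling in pure Python, and `toy_predict_x0`, the noise schedule, the feature values, and every other name are hypothetical placeholders standing in for a trained, context-conditioned network. It only shows that a fixed speaker context with different noise seeds produces different samples.

```python
import math
import random

# Assumed toy noise schedule (not taken from the paper).
T = 50
BETAS = [1e-3 + (0.2 - 1e-3) * i / (T - 1) for i in range(T)]
ALPHAS = [1.0 - b for b in BETAS]
ALPHA_BARS = []
_running = 1.0
for _a in ALPHAS:
    _running *= _a
    ALPHA_BARS.append(_running)


def toy_predict_x0(x_t, context):
    """Hypothetical stand-in for a trained denoiser conditioned on the speaker.

    It guesses the clean sample as the context plus a bounded term that
    depends on the current noisy state, so different noise paths end in
    different outputs.
    """
    return [c + 0.5 * math.tanh(x) for x, c in zip(x_t, context)]


def sample_reaction(context, seed):
    """Draw one sample with DDPM-style ancestral sampling for a given seed."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in context]  # start from Gaussian noise
    for t in reversed(range(T)):
        a_bar = ALPHA_BARS[t]
        x0_hat = toy_predict_x0(x, context)
        # Noise implied by the current clean-sample estimate.
        eps_hat = [(xi - math.sqrt(a_bar) * x0i) / math.sqrt(1.0 - a_bar)
                   for xi, x0i in zip(x, x0_hat)]
        # Posterior mean of the reverse step.
        mean = [(xi - BETAS[t] / math.sqrt(1.0 - a_bar) * ei) / math.sqrt(ALPHAS[t])
                for xi, ei in zip(x, eps_hat)]
        if t > 0:
            sigma = math.sqrt(BETAS[t])
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean  # no noise is added at the final step
    return x


# Made-up speaker features; one context, three seeds, three candidate reactions.
speaker_context = [0.2, -0.1, 0.7]
reactions = [sample_reaction(speaker_context, seed) for seed in range(3)]
```

Reusing a seed reproduces the same sample, while changing the seed changes it. In a real system the conditioning network, not a hand-written formula, would determine which samples count as appropriate reactions.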
Pages: 9561-9565
Number of pages: 5
Related papers
27 in total
  • [21] Expressive 3D Facial Animation Generation Based on Local-to-Global Latent Diffusion
    Song, Wenfeng
    Wang, Xuan
    Jiang, Yiming
    Li, Shuai
    Hao, Aimin
    Hou, Xia
    Qin, Hong
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2024, 30 (11) : 7397 - 7407
  • [22] Short-Term Wind Power Scenario Generation Based on Conditional Latent Diffusion Models
    Dong, Xiaochong
    Mao, Zhihang
    Sun, Yingyun
    Xu, Xinzhi
    IEEE TRANSACTIONS ON SUSTAINABLE ENERGY, 2024, 15 (02) : 1074 - 1085
  • [23] Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation
    Zhang, David Junhao
    Wu, Jay Zhangjie
    Liu, Jia-Wei
    Zhao, Rui
    Ran, Lingmin
    Gu, Yuchao
    Gao, Difei
    Shou, Mike Zheng
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2025, 133 (04) : 1879 - 1893
  • [24] Multiple Facial Reaction Generation using Gaussian Mixture of Models and Multimodal Bottleneck Transformer
    Nguyen, Dang-Khanh
    Paudel, Prabesh
    Kim, Seung-Won
    Shin, Ji-Eun
    Kim, Soo-Hyung
    Yang, Hyung-Jeong
    2024 IEEE 18TH INTERNATIONAL CONFERENCE ON AUTOMATIC FACE AND GESTURE RECOGNITION, FG 2024, 2024,
  • [25] Make-A-Volume: Leveraging Latent Diffusion Models for Cross-Modality 3D Brain MRI Synthesis
    Zhu, Lingting
    Xue, Zeyue
    Jin, Zhenchao
    Liu, Xian
    He, Jingzhen
    Liu, Ziwei
    Yu, Lequan
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2023, PT X, 2023, 14229 : 592 - 601
  • [26] GENERATION OF ANONYMOUS CHEST RADIOGRAPHS USING LATENT DIFFUSION MODELS FOR TRAINING THORACIC ABNORMALITY CLASSIFICATION SYSTEMS
    Packhaeuser, Kai
    Folle, Lukas
    Thamm, Florian
    Maier, Andreas
    2023 IEEE 20TH INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING, ISBI, 2023,
  • [27] DiffPoseTalk: Speech-Driven Stylistic 3D Facial Animation and Head Pose Generation via Diffusion Models
    Sun, Zhiyao
    Lv, Tian
    Ye, Sheng
    Lin, Matthieu
    Sheng, Jenny
    Wen, Yu-Hui
    Yu, Minjing
    Liu, Yong-Jin
    ACM TRANSACTIONS ON GRAPHICS, 2024, 43 (04):