Leveraging the Latent Diffusion Models for Offline Facial Multiple Appropriate Reactions Generation

Cited by: 4

Authors:

Yu, Jun [1]

Zhao, Ji [1]

Xie, Guochen [1]

Chen, Fengxin [2]

Yu, Ye [2]

Peng, Liang [3]

Li, Minglei [3]

Dai, Zonghong [3]

Affiliations:

[1] Univ Sci & Technol China, Hefei, Peoples R China

[2] Hefei Univ Technol, Hefei, Peoples R China

[3] Huawei Cloud, Shenzhen, Peoples R China

Source:

PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023 | 2023

Keywords:

Listener Reaction; Offline Reaction Generation; Diffusion Model;

DOI:

10.1145/3581783.3612865

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Offline Multiple Appropriate Facial Reaction Generation (OMAFRG) aims to predict the reactions of different listeners given a speaker, which is useful in scenarios such as human-computer interaction and social media analysis. In recent years, the Offline Facial Reaction Generation (OFRG) task has been explored in different ways. However, most studies focus only on the deterministic reaction of the listeners. Research on the non-deterministic setting (i.e., OMAFRG) has received insufficient attention, and the results are far from satisfactory. Compared with deterministic OFRG tasks, the OMAFRG task is closer to real circumstances but is more difficult because it requires modeling both stochasticity and context. In this paper, we propose a new model named FRDiff to tackle this issue. Our model is based on the diffusion model architecture, with some modifications to enhance its ability to aggregate context features. The inherent stochasticity of the diffusion model enables our model to generate multiple reactions. We conduct experiments on the datasets provided by the ACM Multimedia REACT2023 and obtain second place on the leaderboard, which demonstrates the effectiveness of our method.


Pages: 9561-9565

Number of pages: 5
