Vector Quantized Diffusion Models for Multiple Appropriate Reactions Generation

Cited by: 0

Authors:

Nguyen, Minh-Duc [1]

Yang, Hyung-Jeong [1]

Ho, Ngoc-Huynh [1]

Kim, Soo-Hyung [1]

Kim, Seungwon [1]

Shin, Ji-Eun [1]

Affiliations:

[1] Chonnam National University, Gwangju, South Korea

Source:

2024 IEEE 18TH INTERNATIONAL CONFERENCE ON AUTOMATIC FACE AND GESTURE RECOGNITION, FG 2024 | 2024

Funding:

National Research Foundation of Singapore;

Keywords:

DOI:

10.1109/FG59268.2024.10581978

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In the realm of dyadic interactions, the ability to generate appropriate facial reactions is paramount for the conveyance of empathy and understanding. This paper introduces a novel framework that leverages the strengths of a diffusion model architecture, underpinned by a vector quantized variational autoencoder (VQ-VAE), to synthesize facial reactions that are contextually apt. We rigorously evaluate our model on the IEEE FG REACT2024 dataset, where it demonstrates superior performance, outshining baseline methods in terms of effectiveness. The results underscore the potential of our framework to enhance the fidelity of digital human interactions, paving the way for more nuanced and emotionally intelligent systems.


Pages: 5
