Vector Quantized Diffusion Models for Multiple Appropriate Reactions Generation

Citations: 0
Authors
Nguyen, Minh-Duc [1]
Yang, Hyung-Jeong [1]
Ho, Ngoc-Huynh [1]
Kim, Soo-Hyung [1]
Kim, Seungwon [1]
Shin, Ji-Eun [1]
Affiliation
[1] Chonnam National University, Gwangju, South Korea
Funding
National Research Foundation, Singapore
DOI
10.1109/FG59268.2024.10581978
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In the realm of dyadic interactions, the ability to generate appropriate facial reactions is paramount for the conveyance of empathy and understanding. This paper introduces a novel framework that leverages the strengths of a diffusion model architecture, underpinned by a vector quantized variational autoencoder (VQ-VAE) to synthesize facial reactions that are contextually apt. We rigorously evaluate our model on the IEEE FG REACT2024 dataset, where it demonstrates superior performance, outshining baseline methods in terms of effectiveness. The results underscore the potential of our framework to enhance the fidelity of digital human interactions, paving the way for more nuanced and emotionally intelligent systems.
Pages: 5