Open Domain Dialogue Generation with Latent Images

Authors
Yang, Ze [1]
Wu, Wei [2]
Hu, Huang [3]
Xu, Can [3]
Wang, Wei [4]
Li, Zhoujun [1]
Affiliations
[1] Beihang Univ, State Key Lab Software Dev Environm, Beijing, Peoples R China
[2] Meituan, Beijing, Peoples R China
[3] Microsoft, Beijing, Peoples R China
[4] China Resources Grp, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We consider grounding open domain dialogues with images. Existing work assumes that both an image and a textual context are available, but image-grounded dialogues by nature are more difficult to obtain than textual dialogues. Thus, we propose learning a response generation model with both image-grounded dialogues and textual dialogues by assuming that the visual scene information at the time of a conversation can be represented by an image, and trying to recover the latent images of the textual dialogues through text-to-image generation techniques. The likelihood of the two types of dialogues is then formulated by a response generator and an image reconstructor that are learned within a conditional variational auto-encoding framework. Empirical studies are conducted in both image-grounded conversation and text-based conversation. In the first scenario, image-grounded dialogues, especially under a low-resource setting, can be effectively augmented by textual dialogues with latent images; while in the second scenario, latent images can enrich the content of responses and at the same time keep them relevant to contexts.
Pages: 14239-14247
Number of pages: 9