Emotion Reinforced Visual Storytelling

Cited by: 17
Authors
Li, Nanxing [1, 2]
Liu, Bei [3]
Han, Zhizhong [4]
Liu, Yu-Shen [1, 2]
Fu, Jianlong [3]
Affiliations
[1] Tsinghua University, School of Software, Beijing, People's Republic of China
[2] Beijing National Research Center for Information Science and Technology (BNRist), Beijing, People's Republic of China
[3] Microsoft Research Asia, Beijing, People's Republic of China
[4] University of Maryland, Department of Computer Science, College Park, MD 20742, USA
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
Storytelling; Multi-Modal; Emotion; Reinforcement Learning;
DOI
10.1145/3323873.3325050
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Subject classification codes
081202; 0835
Abstract
Automatic story generation from a sequence of images, i.e., visual storytelling, has attracted extensive attention. The main challenge derives from modeling the rich, visually inspired human emotions that lead to diverse yet realistic stories even from the same sequence of images. Existing works usually adopt sequence-based generative adversarial networks (GANs) that encode deterministic image content (e.g., concepts, attributes), while neglecting probabilistic inference from an image over the emotion space. In this paper, we take one step further toward human-level stories by modeling image content with emotions and generating a textual paragraph via emotion reinforced adversarial learning. First, we introduce the concept of emotion engaged in visual storytelling. The emotion feature is a representation of the emotional content of the generated story, which enables our model to capture human emotion. Second, stories are generated by a recurrent neural network and further optimized by emotion reinforced adversarial learning with three critics, which ensure visual relevance, language style, and emotion consistency. Our model can generate stories based not only on emotions produced by our novel emotion generator, but also on customized emotions. The introduction of emotion brings more variety and realism to visual storytelling. We evaluate the proposed model on the largest visual storytelling dataset (VIST). Extensive experiments show superior performance over state-of-the-art methods.
Pages: 297-305
Number of pages: 9