Emotion Aware Reinforcement Network for Visual Storytelling

Cited by: 0
Authors
Li, Xin [1]
Cai, Hanqing [1]
Jiang, Tianling [1]
Liu, Chunping [1]
Ji, Yi [1]
Affiliations
[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Visual storytelling; Attention mechanism; Reinforcement learning;
DOI
10.1007/978-3-031-15931-2_3
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Visual storytelling is the task of generating a sequence of human-like sentences (i.e., a story) for an ordered stream of images. Unlike traditional image captioning, the story contains not only factual descriptions but also concepts and objects that do not explicitly appear in the input images. Recent works utilize either end-to-end or multi-stage frameworks to produce more relevant and coherent stories but usually ignore latent emotional information. In this work, to generate an affective story, we propose an Emotion Aware Reinforcement Network for VIsual StoryTelling (EARN-VIST). Specifically, in our network, lexicon-based attention is leveraged to encourage the model to pay more attention to emotional words. We then apply two emotional-consistency reinforcement learning rewards, computed with an emotion classifier and a commonsense transformer, respectively, to measure the gap between the generated story and the human-labeled story and thereby refine the generation process. Experimental results on the VIST dataset and human evaluation demonstrate that our model outperforms most of the cutting-edge models across multiple evaluation metrics.
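The abstract does not give the form of the emotional consistency rewards. As a rough illustration of the idea only, the sketch below scores a generated story by how closely its emotion distribution matches that of a human-written reference. Everything here is an assumption made for illustration: the tiny word lexicon stands in for the paper's emotion classifier, and the names `TOY_LEXICON`, `emotion_distribution`, and `emotion_consistency_reward`, as well as the choice of cosine similarity, are hypothetical and are not taken from EARN-VIST.

```python
import math
from collections import Counter

# Hypothetical toy lexicon (an assumption): a stand-in for a trained
# emotion classifier, which the abstract mentions but does not specify.
TOY_LEXICON = {
    "happy": "joy", "smiled": "joy", "fun": "joy",
    "sad": "sadness", "cried": "sadness",
    "scared": "fear",
}
EMOTIONS = ("joy", "sadness", "fear")


def emotion_distribution(story):
    """Return a normalized emotion histogram for a story string."""
    words = (w.strip(".,!?") for w in story.lower().split())
    counts = Counter(TOY_LEXICON[w] for w in words if w in TOY_LEXICON)
    total = sum(counts.values())
    if total == 0:
        # No emotional words found: fall back to a uniform distribution.
        return [1.0 / len(EMOTIONS)] * len(EMOTIONS)
    return [counts[e] / total for e in EMOTIONS]


def emotion_consistency_reward(generated, reference):
    """Cosine similarity between the two emotion distributions.

    A higher value means the generated story is emotionally closer to the
    reference; in a reinforcement learning setup such a scalar could be
    used as a sequence-level reward (an assumed usage, not the paper's).
    """
    p = emotion_distribution(generated)
    q = emotion_distribution(reference)
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm
```

With this toy lexicon, a generated story whose emotional words fall in the same category as the reference's receives a reward close to 1, while one expressing a different emotion receives a reward close to 0.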
Pages: 26-37
Number of pages: 12