Emotion Aware Reinforcement Network for Visual Storytelling

Cited by: 0
Authors
Li, Xin [1]
Cai, Hanqing [1]
Jiang, Tianling [1]
Liu, Chunping [1]
Ji, Yi [1]
Affiliations
[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Visual storytelling; Attention mechanism; Reinforcement learning
DOI
10.1007/978-3-031-15931-2_3
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Visual storytelling is the task of generating a sequence of human-like sentences (i.e., a story) for an ordered stream of images. Unlike traditional image captioning, the story contains not only factual descriptions but also concepts and objects that do not explicitly appear in the input images. Recent works use either end-to-end or multi-stage frameworks to produce more relevant and coherent stories, but they usually ignore latent emotional information. In this work, to generate an affective story, we propose an Emotion Aware Reinforcement Network for VIsual StoryTelling (EARN-VIST). Specifically, our network leverages lexicon-based attention to encourage the model to pay more attention to emotional words. We then apply two emotional-consistency reinforcement learning rewards, computed with an emotion classifier and a commonsense transformer respectively, to measure the gap between the generated story and the human-labeled story and thereby refine the generation process. Experimental results on the VIST dataset and human evaluation demonstrate that our model outperforms most cutting-edge models across multiple evaluation metrics.
Pages: 26-37
Page count: 12
Related papers
50 in total
  • [1] Emotion Reinforced Visual Storytelling
    Li, Nanxing
    Liu, Bei
    Han, Zhizhong
    Liu, Yu-Shen
    Fu, Jianlong
    ICMR'19: PROCEEDINGS OF THE 2019 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, 2019, : 297 - 305
  • [2] Modeling Protagonist Emotions for Emotion-Aware Storytelling
    Brahman, Faeze
    Chaturvedi, Snigdha
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 5277 - 5294
  • [3] Stimuli-Aware Visual Emotion Analysis
    Yang, Jingyuan
    Li, Jie
    Wang, Xiumei
    Ding, Yuxuan
    Gao, Xinbo
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 7432 - 7445
  • [4] Emotion and storytelling
    Juliá, MP
    ARBOR-CIENCIA PENSAMIENTO Y CULTURA, 2004, 177 (697) : 125 - 156
  • [5] Environments to support context and emotion aware visual interaction
    Fogli, D
    Piccinno, A
    JOURNAL OF VISUAL LANGUAGES AND COMPUTING, 2005, 16 (05) : 386 - 405
  • [6] Commonsense Knowledge Aware Concept Selection for Diverse and Informative Visual Storytelling
    Chen, Hong
    Huang, Yifei
    Takamura, Hiroya
    Nakayama, Hideki
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 999 - 1008
  • [7] Unsupervised Time-Aware Sampling Network With Deep Reinforcement Learning for EEG-Based Emotion Recognition
    Zhang, Yongtao
    Pan, Yue
    Zhang, Yulin
    Zhang, Min
    Li, Linling
    Zhang, Li
    Huang, Gan
    Su, Lei
    Liu, Honghai
    Liang, Zhen
    Zhang, Zhiguo
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2024, 15 (03) : 1090 - 1103
  • [8] GLOCAL CASCADING NETWORK FOR TOPIC ENHANCED VISUAL STORYTELLING
    Su, Jiaqi
    Chen, Weiran
    Ji, Yi
    Liu, Chunping
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024, : 2845 - 2849
  • [9] Mapmaking as visual storytelling: the movement and emotion of managing sex work in the urban landscape
    Jordeno, Sara
    Horning, Amber
    CRIME LAW AND SOCIAL CHANGE, 2024, 81 (05) : 537 - 558
  • [10] Introduction to the special issue on "Context and emotion aware visual computing"
    Bianchi-Berthouze, Nadia
    Mussio, Piero
    JOURNAL OF VISUAL LANGUAGES AND COMPUTING, 2006, 17 (05) : 395 - 397