Emotion Aware Reinforcement Network for Visual Storytelling

Cited by: 0

Authors:

Li, Xin [1]

Cai, Hanqing [1]

Jiang, Tianling [1]

Liu, Chunping [1]

Ji, Yi [1]

Affiliations:

[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou, Peoples R China

Source:

ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2022, PT II | 2022 / Vol. 13530

Funding:

National Natural Science Foundation of China;

Keywords:

Visual storytelling; Attention mechanism; Reinforcement learning;

DOI:

10.1007/978-3-031-15931-2_3

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Visual storytelling is the task of generating a sequence of human-like sentences (i.e., a story) for an ordered stream of images. Unlike traditional image captioning, the story contains not only factual descriptions but also concepts and objects that do not explicitly appear in the input images. Recent works utilize either end-to-end or multi-stage frameworks to produce more relevant and coherent stories but usually ignore latent emotional information. In this work, to generate an affective story, we propose an Emotion Aware Reinforcement Network for VIsual StoryTelling (EARN-VIST). Specifically, in our network, lexicon-based attention is leveraged to encourage the model to pay more attention to emotional words. We then apply two emotional-consistency reinforcement learning rewards, computed with an emotion classifier and a commonsense transformer respectively, to measure the gap between the generated story and the human-labeled story and thereby refine the generation process. Experimental results on the VIST dataset and human evaluation demonstrate that our model outperforms most of the cutting-edge models across multiple evaluation metrics.
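
The two ideas named in the abstract (attention biased toward emotional words, and a reward that compares the emotion of a generated story with that of the reference) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy lexicon, the additive `bonus`, the function names, and the L1-based reward are all assumptions chosen for illustration, and the abstract does not specify how EARN-VIST computes either quantity.

```python
import math

# Hypothetical toy lexicon; the lexicon used in the paper is not given in this record.
EMOTION_LEXICON = {"happy", "sad", "excited", "afraid"}


def lexicon_attention(tokens, scores, bonus=1.0):
    """Softmax over attention scores, with an additive bonus (an assumption)
    applied to tokens that appear in the emotion lexicon."""
    biased = [
        s + (bonus if t.lower() in EMOTION_LEXICON else 0.0)
        for t, s in zip(tokens, scores)
    ]
    m = max(biased)  # subtract the max for numerical stability
    exps = [math.exp(b - m) for b in biased]
    total = sum(exps)
    return [e / total for e in exps]


def emotion_consistency_reward(pred_dist, ref_dist):
    """Illustrative reward in [0, 1]: one minus half the L1 distance between
    the emotion distributions of the generated and reference stories."""
    return 1.0 - 0.5 * sum(abs(p - r) for p, r in zip(pred_dist, ref_dist))


tokens = ["the", "kids", "were", "happy"]
weights = lexicon_attention(tokens, [0.0, 0.0, 0.0, 0.0])
same = emotion_consistency_reward([0.7, 0.3], [0.7, 0.3])
opposite = emotion_consistency_reward([1.0, 0.0], [0.0, 1.0])
```

With equal raw scores, the lexicon word "happy" receives the largest attention weight, and the reward is highest when the two emotion distributions coincide. In a reinforcement learning setup such a reward would scale the log-likelihood of sampled stories; that training loop is omitted here.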

Pages: 26-37

Page count: 12
