Emotion Aware Reinforcement Network for Visual Storytelling

Cited by: 0

Authors:

Li, Xin [1]

Cai, Hanqing [1]

Jiang, Tianling [1]

Liu, Chunping [1]

Ji, Yi [1]

Affiliations:

[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou, Peoples R China

Source:

ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2022, PT II | 2022 / Vol. 13530

Funding:

National Natural Science Foundation of China;

Keywords:

Visual storytelling; Attention mechanism; Reinforcement learning;

DOI:

10.1007/978-3-031-15931-2_3

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Visual storytelling is the task of generating a sequence of human-like sentences (i.e., a story) for an ordered stream of images. Unlike traditional image captioning, the story contains not only factual descriptions but also concepts and objects that do not explicitly appear in the input images. Recent works use either end-to-end or multi-stage frameworks to produce more relevant and coherent stories but usually ignore latent emotional information. In this work, to generate an affective story, we propose an Emotion Aware Reinforcement Network for VIsual StoryTelling (EARN-VIST). Specifically, lexicon-based attention is leveraged to encourage the model to pay more attention to emotional words. We then apply two emotional-consistency reinforcement learning rewards, computed with an emotion classifier and a commonsense transformer respectively, to measure the gap between the generated story and the human-labeled story and thereby refine the generation process. Experimental results on the VIST dataset and human evaluation demonstrate that our model outperforms most of the cutting-edge models across multiple evaluation metrics.


Pages: 26-37

Page count: 12
