Commonsense Knowledge Aware Concept Selection for Diverse and Informative Visual Storytelling

Cited by: 0
Authors
Chen, Hong [1 ,3 ]
Huang, Yifei [1 ]
Takamura, Hiroya [2 ,3 ]
Nakayama, Hideki [1 ,3 ]
Affiliations
[1] Univ Tokyo, Tokyo, Japan
[2] Tokyo Inst Technol, Tokyo, Japan
[3] Natl Inst Adv Ind Sci & Technol, Tokyo, Japan
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Visual storytelling is the task of generating relevant and interesting stories for given image sequences. In this work, we aim to increase the diversity of the generated stories while preserving the informative content of the images. We propose to foster the diversity and informativeness of a generated story with a concept selection module that suggests a set of concept candidates. We then use a large-scale pretrained model to convert the concepts and images into full stories. To enrich the candidate concepts, a commonsense knowledge graph is created for each image sequence, from which the concept candidates are proposed. To obtain appropriate concepts from the graph, we propose two novel modules that consider the correlation among candidate concepts and the image-concept correlation. Extensive automatic and human evaluation results demonstrate that our model can produce reasonable concepts. This enables our model to outperform previous models by a large margin on the diversity and informativeness of the story, while retaining the relevance of the story to the image sequence.
Pages: 999 - 1008
Number of pages: 10
Related papers
50 in total
  • [31] PAINE Demo: Optimizing Video Selection Queries With Commonsense Knowledge
    He, Wenjia
    Sabek, Ibrahim
    Lou, Yuze
    Cafarella, Michael
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2023, 16 (12): : 3902 - 3905
  • [32] Building a Concept-Level Sentiment Dictionary Based on Commonsense Knowledge
    Tsai, Angela Charng-Rurng
    Wu, Chi-En
    Tsai, Richard Tzong-Han
    Hsu, Jane Yung-jen
    IEEE INTELLIGENT SYSTEMS, 2013, 28 (02) : 22 - 30
  • [33] Visual storytelling enhances knowledge dissemination in biomedical science
    Botsis, Taxiarchis
    Fairman, Jennifer E.
    Moran, Meghan Bridgid
    Anagnostou, Valsamo
    JOURNAL OF BIOMEDICAL INFORMATICS, 2020, 107
  • [34] A Concept of Visual Knowledge Representation
    Jaworska, Tatiana
    MULTIMEDIA AND NETWORK INFORMATION SYSTEMS, 2019, 833 : 13 - 22
  • [35] Gaussian Distribution-Aware Commonsense Knowledge Learning for Scene Graph Generation
    Tian, Hongshuo
    Xu, Ning
    Kankanhalli, Mohan
    Liu, An-An
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (12) : 13044 - 13057
  • [36] Towards Confidence-Aware Commonsense Knowledge Integration for Scene Graph Generation
    Tian, Hongshuo
    Xu, Ning
    Wang, Yanhui
    Yan, Chenggang
    Zheng, Bolun
    Li, Xuanya
    Liu, An-An
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 2255 - 2260
  • [37] VLC-BERT: Visual Question Answering with Contextualized Commonsense Knowledge
    Ravi, Sahithya
    Chinchure, Aditya
    Sigal, Leonid
    Liao, Renjie
    Shwartz, Vered
    2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 1155 - 1165
  • [38] EFFICIENT SELECTION OF INFORMATIVE AND DIVERSE TRAINING SAMPLES WITH APPLICATIONS IN SCENE CLASSIFICATION
    Paul, Sujoy
    Bappy, Jawadul H.
    Roy-Chowdhury, Amit K.
    2016 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2016, : 494 - 498
  • [39] KVL-BERT: Knowledge Enhanced Visual-and-Linguistic BERT for visual commonsense reasoning
    Song, Dandan
    Ma, Siyi
    Sun, Zhanchen
    Yang, Sicheng
    Liao, Lejian
    KNOWLEDGE-BASED SYSTEMS, 2021, 230
  • [40] Robust visual tracking via online informative feature selection
    Song, Huihui
    ELECTRONICS LETTERS, 2014, 50 (25) : 1931 - 1932