Commonsense Knowledge Aware Concept Selection for Diverse and Informative Visual Storytelling

Cited by: 0
Authors
Chen, Hong [1, 3]
Huang, Yifei [1]
Takamura, Hiroya [2, 3]
Nakayama, Hideki [1, 3]
Affiliations
[1] Univ Tokyo, Tokyo, Japan
[2] Tokyo Inst Technol, Tokyo, Japan
[3] Natl Inst Adv Ind Sci & Technol, Tokyo, Japan
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Visual storytelling is the task of generating relevant and interesting stories for given image sequences. In this work, we aim to increase the diversity of the generated stories while preserving the informative content from the images. We propose to foster the diversity and informativeness of a generated story by using a concept selection module that suggests a set of concept candidates. We then use a large-scale pretrained model to convert concepts and images into full stories. To enrich the candidate concepts, a commonsense knowledge graph is created for each image sequence, from which the concept candidates are proposed. To obtain appropriate concepts from the graph, we propose two novel modules that consider the correlation among candidate concepts and the image-concept correlation. Extensive automatic and human evaluation results demonstrate that our model can produce reasonable concepts. This enables our model to outperform previous models by a large margin on the diversity and informativeness of the story, while retaining the relevance of the story to the image sequence.
Pages: 999-1008
Number of pages: 10