Commonsense Knowledge Aware Concept Selection for Diverse and Informative Visual Storytelling

Authors
Chen, Hong [1, 3]
Huang, Yifei [1]
Takamura, Hiroya [2, 3]
Nakayama, Hideki [1, 3]
Affiliations
[1] Univ Tokyo, Tokyo, Japan
[2] Tokyo Inst Technol, Tokyo, Japan
[3] Natl Inst Adv Ind Sci & Technol, Tokyo, Japan
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Visual storytelling is the task of generating relevant and interesting stories for given image sequences. In this work we aim to increase the diversity of the generated stories while preserving the informative content of the images. We propose to foster the diversity and informativeness of a generated story by using a concept selection module that suggests a set of concept candidates. We then use a large-scale pretrained model to convert the concepts and images into full stories. To enrich the candidate concepts, a commonsense knowledge graph is created for each image sequence, and the concept candidates are proposed from this graph. To obtain appropriate concepts from the graph, we propose two novel modules that consider the correlation among candidate concepts and the image-concept correlation. Extensive automatic and human evaluation results demonstrate that our model can produce reasonable concepts. This enables our model to outperform previous models by a large margin on the diversity and informativeness of the story, while retaining the relevance of the story to the image sequence.
Pages: 999-1008
Number of pages: 10