Image-Collection Summarization Using Scene-Graph Generation With External Knowledge

Authors
Phueaksri, Itthisak [1, 2]
Kastner, Marc A. [3]
Kawanishi, Yasutomo [1, 2]
Komamizu, Takahiro [1, 4]
Ide, Ichiro [1, 4]
Affiliations
[1] Nagoya Univ, Grad Sch Informat, Nagoya, Aichi 4648601, Japan
[2] RIKEN, Informat Res & Dev & Strategy Headquarters, Guardian Robot Project, Kyoto 6190288, Japan
[3] Kyoto Univ, Grad Sch Informat, Kyoto 6068501, Japan
[4] Nagoya Univ, Math & Data Sci Ctr, Nagoya, Aichi 4648601, Japan
Keywords
Object detection; Knowledge graphs; Semantics; Visualization; Image analysis; Market research; Image collection summarization; multiple-image summarization; semantic images summarization; scene-graph generation; scene-graph summarization; SIMILARITY; LANGUAGE;
DOI
10.1109/ACCESS.2024.3360113
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Summarization tasks aim to condense multiple pieces of information into a short description or representative information. A text summarization task condenses textual information into a short description, whereas an image-collection summarization task condenses an image collection into representative images or a textual representation, where the challenge is to understand the relationships between images. In recent years, scene-graph generation has proven effective at describing the visual context of a single image, and incorporating external knowledge into scene-graph generation models has shown promise for generating scene graphs of unseen single images. Although related work has used external knowledge, using this information efficiently for relationship estimation during summarization remains challenging. Following this trend, we propose a novel scene-graph-based image-collection summarization model that generates a summarized scene graph of an image collection. The key idea of the proposed method is to enhance the relation predictor so that it captures relationships between images in a collection, incorporating knowledge graphs as external knowledge when training the model. With this approach, we build an end-to-end framework that generates a summarized scene graph of an image collection. To evaluate the proposed method, we also build an extended annotated MS-COCO dataset for this task and introduce an evaluation process that estimates the similarity between a summarized scene graph and ground-truth scene graphs. Traditional evaluation calculates precision and recall scores, which are based on true-positive predictions, without balancing the two. In contrast, the proposed evaluation process calculates the F-score of the similarity between a summarized scene graph and the ground-truth scene graphs, which balances false positives and false negatives. Experimental results show that using external knowledge to enhance the relation predictor achieves better results than existing methods.
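The idea of balancing precision and recall with an F-score can be illustrated with a minimal sketch. This is not the evaluation process from the paper: it assumes scene graphs are plain sets of (subject, predicate, object) triplets, that a prediction counts only on an exact match, and that the ground-truth graphs have already been merged into one reference set. The function name `triplet_f_score` and the sample triplets are hypothetical.

```python
from typing import Iterable, Set, Tuple

# A scene-graph edge as (subject, predicate, object); an assumed representation.
Triplet = Tuple[str, str, str]


def triplet_f_score(
    predicted: Iterable[Triplet], ground_truth: Iterable[Triplet]
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) using exact triplet matching.

    Exact matching is an assumption of this sketch; the abstract describes
    a similarity-based comparison whose details are not given here.
    """
    pred: Set[Triplet] = set(predicted)
    gold: Set[Triplet] = set(ground_truth)
    true_pos = len(pred & gold)  # triplets present in both graphs

    # Precision penalizes false positives; recall penalizes false negatives.
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0

    # F1 is the harmonic mean, so it is low when either score is low.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical summarized graph and merged ground-truth graph.
summary = [
    ("person", "riding", "horse"),
    ("horse", "on", "grass"),
    ("person", "wearing", "hat"),
]
reference = [
    ("person", "riding", "horse"),
    ("horse", "on", "grass"),
    ("dog", "near", "person"),
    ("person", "holding", "rope"),
]
p, r, f = triplet_f_score(summary, reference)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Here two of three predicted triplets are correct and two of four reference triplets are recovered, so the F-score falls between the two values and drops if either false positives or false negatives increase.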
Pages: 17499-17512
Number of pages: 14