Learning to Relate from Captions and Bounding Boxes

Cited by: 0
Authors
Garg, Sarthak [1]
Moniz, Joel Ruben Antony [1]
Aviral, Anshu [1]
Bollimpalli, Priyatham [1]
Affiliations
[1] Carnegie Mellon Univ, Sch Comp Sci, Pittsburgh, PA 15213 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this work, we propose a novel approach that predicts the relationships between entities in an image in a weakly supervised manner, relying on image captions and object bounding box annotations as the sole source of supervision. Our approach uses a top-down attention mechanism to align entities in captions to objects in the image, and then leverages the syntactic structure of the captions to align the relations. We use these alignments to train a relation classification network, thereby obtaining both grounded captions and dense relationships. We demonstrate the effectiveness of our model on the Visual Genome dataset by achieving a recall@50 of 15% and a recall@100 of 25% on the relationships present in the image. We also show that the model successfully predicts relations that are not present in the corresponding captions.
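The recall@50 and recall@100 figures in the abstract can be read against the following minimal sketch of a recall@K computation for one image. It is an illustration only, not the authors' evaluation code: the function name `recall_at_k`, the triple representation, and the exact-match criterion are assumptions made here, and a real Visual Genome evaluation would typically also require the predicted boxes to overlap the ground-truth boxes.

```python
from typing import Hashable, Iterable, Sequence, Tuple

# Hypothetical representation: a relationship is a (subject, predicate, object) triple.
Triple = Tuple[Hashable, Hashable, Hashable]


def recall_at_k(
    scored_predictions: Sequence[Tuple[Triple, float]],
    ground_truth: Iterable[Triple],
    k: int,
) -> float:
    """Fraction of ground-truth triples found among the k highest-scoring predictions.

    Assumption: a prediction matches a ground-truth triple only if the two
    are exactly equal; bounding-box overlap is deliberately ignored here.
    """
    gt = set(ground_truth)
    if not gt:
        return 0.0
    # Rank predictions by confidence, highest first, and keep the top k.
    ranked = sorted(scored_predictions, key=lambda item: item[1], reverse=True)
    top_k = {triple for triple, _score in ranked[:k]}
    return len(gt & top_k) / len(gt)


if __name__ == "__main__":
    predictions = [
        (("man", "rides", "horse"), 0.9),
        (("man", "wears", "hat"), 0.8),
        (("horse", "on", "grass"), 0.2),
    ]
    truth = [("man", "rides", "horse"), ("horse", "on", "grass")]
    print(recall_at_k(predictions, truth, k=2))
    print(recall_at_k(predictions, truth, k=3))
```

With this toy data, only one of the two ground-truth triples is in the top two predictions, while both are in the top three. A dataset-level score would average or pool such per-image values; which of the two the paper uses is not stated in this record.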
Pages: 6597-6603
Page count: 7