Learning Unsupervised Visual Grounding Through Semantic Self-Supervision

Citations: 0
Authors
Javed, Syed Ashar [1]
Saxena, Shreyas
Gandhi, Vineet [2]
Affiliations
[1] Carnegie Mellon Univ, Robot Inst, Pittsburgh, PA 15213 USA
[2] IIIT Hyderabad, CVIT, Kohli Ctr Intelligent Syst KCIS, Hyderabad, India
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Localizing natural language phrases in images is a challenging problem that requires joint understanding of both the textual and visual modalities. In the unsupervised setting, the lack of supervisory signals exacerbates this difficulty. In this paper, we propose a novel framework for unsupervised visual grounding which uses concept learning as a proxy task to obtain self-supervision. The intuition behind this idea is to encourage the model to localize to regions which can explain some semantic property in the data, in our case, the property being the presence of a concept in a set of images. We present thorough quantitative and qualitative experiments to demonstrate the efficacy of our approach and show a 5.6% improvement over the current state of the art on the Visual Genome dataset, a 5.8% improvement on the ReferItGame dataset, and performance comparable to the state of the art on the Flickr30k dataset.
Pages: 796-802
Page count: 7