Similarity Maps for Self-Training Weakly-Supervised Phrase Grounding

Cited by: 1
Authors
Shaharabany, Tal [1]
Wolf, Lior [1]
Affiliations
[1] Tel Aviv Univ, Tel Aviv, Israel
DOI
10.1109/CVPR52729.2023.00669
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
A phrase grounding model receives an input image and a text phrase and outputs a suitable localization map. We present an effective way to refine a phrase grounding model by considering self-similarity maps extracted from the latent representation of the model's image encoder. Our main insights are that these maps resemble localization maps and that, by combining such maps, one can obtain useful pseudo-labels for self-training. Our results surpass the state of the art in weakly supervised phrase grounding by a large margin. A similar performance gap is obtained for a recently proposed downstream task called WWbL, in which only the image is given as input, without any text. Our code is available at https://github.com/talshaharabany/Similarity-Maps-for-Self-Training-Weakly-Supervised-Phrase-Grounding.
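The abstract does not say how the self-similarity maps are computed or combined, so the following is only a minimal sketch of the general idea under stated assumptions: each map is taken to be the cosine similarity between one reference location's feature vector and every other location in the encoder's feature grid, maps are combined by averaging, and the pseudo-label is a thresholded mask. The function names, the averaging rule, and the 0.5 threshold are all hypothetical and are not taken from the paper or its repository.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def self_similarity_map(features, ref):
    """Similarity of the feature at `ref` = (row, col) to every grid location.

    `features` is an H x W grid (list of lists) of feature vectors.
    """
    row, col = ref
    anchor = features[row][col]
    return [[cosine(anchor, vec) for vec in grid_row] for grid_row in features]


def combine_maps(maps):
    """Average the maps element-wise, then min-max normalise to [0, 1].

    Averaging is an assumption made for this sketch only.
    """
    height, width = len(maps[0]), len(maps[0][0])
    avg = [[sum(m[i][j] for m in maps) / len(maps) for j in range(width)]
           for i in range(height)]
    lo = min(min(r) for r in avg)
    hi = max(max(r) for r in avg)
    if hi == lo:
        return [[0.0] * width for _ in range(height)]
    return [[(v - lo) / (hi - lo) for v in r] for r in avg]


def pseudo_label(combined, threshold=0.5):
    """Binarise a combined map into a mask (the threshold is hypothetical)."""
    return [[1 if v >= threshold else 0 for v in r] for r in combined]


# Toy 2 x 2 feature grid: the top row shares one direction, the bottom row another.
features = [
    [[1.0, 0.0], [0.9, 0.1]],
    [[0.0, 1.0], [0.1, 0.9]],
]
maps = [self_similarity_map(features, ref) for ref in [(0, 0), (0, 1)]]
combined = combine_maps(maps)
mask = pseudo_label(combined)
print(mask)  # [[1, 1], [0, 0]]
```

In this toy grid both reference locations lie in the top row, so the combined map highlights the top row and the thresholded mask selects it. In a self-training setup a mask of this kind would serve as the target for a further round of training.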
Pages: 6925-6934
Page count: 10