Source-Free Image-Text Matching via Uncertainty-Aware Learning

Times Cited: 0
Authors
Tian, Mengxiao [1 ,2 ]
Yang, Shuo [3 ]
Wu, Xinxiao [1 ,2 ]
Jia, Yunde [3 ]
Affiliations
[1] Beijing Inst Technol, Sch Comp Sci, Beijing Lab Intelligent Informat Technol, Beijing 100081, Peoples R China
[2] Shenzhen MSU BIT Univ, Guangdong Prov Lab Machine Percept & Intelligent C, Shenzhen 518172, Peoples R China
[3] Shenzhen MSU BIT Univ, Guangdong Prov Lab Machine Percept & Intelligent C, Shenzhen 518172, Peoples R China
Keywords
Adaptation models; Uncertainty; Noise measurement; Data models; Training; Noise; Visualization; Measurement uncertainty; Computational modeling; Testing; Image-text matching; source-free adaptation; uncertainty-aware learning;
DOI
10.1109/LSP.2024.3488521
CLC Classification Number
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Subject Classification Code
0808 ; 0809 ;
Abstract
When a trained image-text matching model is applied to a new scenario, its performance may degrade substantially due to domain shift, which limits its use in real-world applications. In this paper, we make the first attempt at adapting an image-text matching model well-trained on a labeled source domain to an unlabeled target domain without access to the source data, namely, source-free image-text matching. This task is challenging because the source data cannot be consulted when learning to reduce the domain shift. To address this challenge, we propose a simple yet effective method that introduces uncertainty-aware learning to generate high-quality pseudo-pairs of images and texts for target adaptation. Specifically, we first use the pre-trained source model to retrieve several top-ranked image-text pairs from the target domain as pseudo-pairs. We then model the uncertainty of each pseudo-pair by computing the variance of the retrieved texts (resp. images) given the paired image (resp. text) as the query, and finally incorporate this uncertainty into the objective function to down-weight noisy pseudo-pairs during training, thereby enhancing adaptation. This uncertainty-aware training approach is model-agnostic and can be applied to existing image-text matching models. Extensive experiments on the COCO and Flickr30K datasets demonstrate the effectiveness of the proposed method.
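To make the idea described in the abstract concrete, the following PyTorch-style sketch shows one plausible way to form pseudo-pairs by top-ranked retrieval and to down-weight them by retrieval variance. The function names, the top-k setting, and the exponential weighting are illustrative assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F


def build_pseudo_pairs_with_uncertainty(img_emb, txt_emb, k=5):
    """Retrieve top-k texts per image with the (frozen) source model's embeddings,
    take the top-1 as a pseudo-pair, and use the variance of the top-k similarity
    scores as an uncertainty estimate for that pair (illustrative sketch)."""
    sim = img_emb @ txt_emb.t()              # similarity matrix (embeddings assumed L2-normalized)
    topk_sim, topk_idx = sim.topk(k, dim=1)  # top-k retrieved texts for each image
    pseudo_txt = topk_idx[:, 0]              # top-1 retrieved text forms the pseudo-pair
    uncertainty = topk_sim.var(dim=1)        # variance of retrieved scores as uncertainty
    return pseudo_txt, uncertainty


def uncertainty_weighted_loss(img_emb, txt_emb, pseudo_txt, uncertainty, margin=0.2):
    """Triplet-style matching loss in which pseudo-pairs with high retrieval
    variance contribute less to training (one simple choice of weighting)."""
    sim = img_emb @ txt_emb.t()
    pos = sim.gather(1, pseudo_txt.unsqueeze(1)).squeeze(1)   # similarity of the pseudo-positive
    masked = sim.clone()
    masked.scatter_(1, pseudo_txt.unsqueeze(1), float('-inf'))  # exclude the pseudo-positive
    neg = masked.max(dim=1).values                               # hardest in-batch negative
    triplet = F.relu(margin + neg - pos)
    weights = torch.exp(-uncertainty)                            # down-weight noisy pseudo-pairs
    return (weights * triplet).mean()
```

The exponential weighting is only one way to translate variance into a per-pair weight; any monotonically decreasing function of the uncertainty would serve the same purpose of suppressing noisy pseudo-pairs while keeping confident ones influential.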
Pages: 3059-3063
Number of Pages: 5
Related Papers
50 records in total
  • [21] A Continual Learning Framework for Uncertainty-Aware Interactive Image Segmentation
    Zheng, Ervine
    Yu, Qi
    Li, Rui
    Shi, Pengcheng
    Haake, Anne
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 6030 - 6038
  • [22] Dual Relation-Aware Synergistic Attention Network for Image-Text Matching
    Qi, Shanshan
    Yang, Luxi
    Li, Chunguo
    Huang, Yongming
    2022 11TH INTERNATIONAL CONFERENCE ON COMMUNICATIONS, CIRCUITS AND SYSTEMS (ICCCAS 2022), 2022, : 251 - 256
  • [23] An end-to-end image-text matching approach considering semantic uncertainty
    Tuerhong, Gulanbaier
    Dai, Xin
    Tian, Liwei
    Wushouer, Mairidan
    NEUROCOMPUTING, 2024, 607
  • [24] Uncertainty-Aware Pedestrian Crossing Prediction via Reinforcement Learning
    Dai, Siyang
    Liu, Jun
    Cheung, Ngai-Man
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (10) : 9540 - 9549
  • [25] Bi-Attention enhanced representation learning for image-text matching
    Tian, Yumin
    Ding, Aqiang
    Wang, Di
    Luo, Xuemei
    Wan, Bo
    Wang, Yifeng
    PATTERN RECOGNITION, 2023, 140
  • [26] Ambiguity-Aware and High-order Relation learning for multi-grained image-text matching
    Chen, Junyu
    Gao, Yihua
    Ge, Mingyuan
    Li, Mingyong
    KNOWLEDGE-BASED SYSTEMS, 2025, 316
  • [27] Learning Image-Text Associations
    Jiang, Tao
    Tan, Ah-Hwee
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2009, 21 (02) : 161 - 177
  • [28] Uncertainty-Aware RGBD Image Segmentation
    Yu, Chengxiao
    Wang, Xin
    Wang, Junqiu
    Zha, Hongbin
    2017 IEEE INTERNATIONAL CONFERENCE ON CYBORG AND BIONIC SYSTEMS (CBS), 2017, : 97 - 102
  • [29] Deep Cross-Modal Projection Learning for Image-Text Matching
    Zhang, Ying
    Lu, Huchuan
    COMPUTER VISION - ECCV 2018, PT I, 2018, 11205 : 707 - 723
  • [30] ITContrast: contrastive learning with hard negative synthesis for image-text matching
    Wu, Fangyu
    Wang, Qiufeng
    Wang, Zhao
    Yu, Siyue
    Li, Yushi
    Zhang, Bailing
    Lim, Eng Gee
    VISUAL COMPUTER, 2024, 40 (12) : 8825 - 8838