Dual-branch contrastive learning for weakly supervised object localization

Authors
Guo, Zebin [1, 2]
Li, Dong [1, 2]
Du, Zhengjun [1, 2]
Seng, Bingfeng [1, 2]
Affiliations
[1] Qinghai Univ, Sch Comp Technol & Applicat, Xining 810000, Peoples R China
[2] Intelligent Comp & Applicat Lab Qinghai Prov, Xining, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Deep learning; Computer vision; Weakly supervised object localization; Dual-branch network; Contrastive learning
DOI
10.1007/s10489-025-06514-1
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Weakly supervised object localization trains object localization models using only image-level labels. Traditional methods based on convolutional neural networks (CNNs) usually localize objects with a class activation map, which tends to activate only the small, most discriminative part of the object. Methods based on the Vision Transformer, in contrast, capture long-range feature dependencies but tend to ignore local feature details. In this paper, we propose a dual-branch contrastive learning (DBC) method that consists of a Transformer branch and a CNN branch. The method separates the background and foreground of an image and fuses the Transformer and CNN features through contrastive learning. Specifically, it first separates the background and foreground representations of the image using initially generated class-agnostic activation maps. Representations of the same image from different branches then form positive pairs for contrastive learning, while the background and foreground representations from the same branch form negative pairs. Finally, a negative contrastive loss forces the model to separate the background and foreground representations, and a positive contrastive loss makes the model fuse the features of the two branches. Experiments on the ILSVRC benchmark show that the proposed method achieves a Top-1 localization accuracy of 59.9% and a GT-known localization accuracy of 71.7%, outperforming state-of-the-art methods of the same parameter complexity.
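The pairing scheme described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the loss formulas, so the activation-weighted pooling, the cosine-based loss terms, and every function name below (`masked_pool`, `split_fg_bg`, `dual_branch_losses`) are assumptions chosen only to show how positive pairs (same region, different branches) and negative pairs (foreground vs. background, same branch) might be formed.

```python
import numpy as np

def masked_pool(features, activation, eps=1e-8):
    """Activation-weighted average of a (C, H, W) feature map, giving a (C,) vector."""
    weights = activation / (activation.sum() + eps)
    return (features * weights[None, :, :]).sum(axis=(1, 2))

def split_fg_bg(features, activation):
    """Assumed separation step: pool with the map for foreground, its complement for background."""
    fg = masked_pool(features, activation)
    bg = masked_pool(features, 1.0 - activation)
    return fg, bg

def cosine(u, v, eps=1e-8):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + eps))

def dual_branch_losses(feat_cnn, feat_trans, act_cnn, act_trans):
    """Return (positive_loss, negative_loss) for one image.

    Positive pairs: the same region seen by different branches (pulled together).
    Negative pairs: foreground vs. background within one branch (pushed apart).
    The cosine-based loss forms are illustrative assumptions.
    """
    fg_c, bg_c = split_fg_bg(feat_cnn, act_cnn)
    fg_t, bg_t = split_fg_bg(feat_trans, act_trans)
    positive = (1.0 - cosine(fg_c, fg_t)) + (1.0 - cosine(bg_c, bg_t))
    negative = max(0.0, cosine(fg_c, bg_c)) + max(0.0, cosine(fg_t, bg_t))
    return positive, negative
```

In this sketch both branches are assumed to produce feature maps with the same channel count and their own activation map with values in [0, 1]; a real model would need a projection to align the two branches and would likely operate on batches.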
Pages: 16