WeCromCL: Weakly Supervised Cross-Modality Contrastive Learning for Transcription-Only Supervised Text Spotting

Times cited: 0
Authors
Wu, Jingjing [1]
Fang, Zhengyao [1]
Lyu, Pengyuan [2]
Zhang, Chengquan [2]
Chen, Fanglin [1]
Lu, Guangming [1]
Pei, Wenjie [1]
Affiliations
[1] Harbin Inst Technol, Shenzhen, Peoples R China
[2] Baidu Inc, Dept Comp Vis Technol, Beijing, Peoples R China
Source
Funding
National Natural Science Foundation of China;
Keywords
Transcription-only supervised text spotting; Weakly supervised cross-modality contrastive learning;
DOI
10.1007/978-3-031-72751-1_17
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Transcription-only Supervised Text Spotting aims to learn text spotters using only transcriptions, without text boundaries, for supervision, thus eliminating expensive boundary annotation. The crux of this task lies in locating each transcription in scene text images without location annotations. In this work, we formulate this challenging problem as a Weakly Supervised Cross-modality Contrastive Learning problem, and design a simple yet effective model dubbed WeCromCL that is able to detect each transcription in a scene image in a weakly supervised manner. Typical methods for cross-modality contrastive learning focus on modeling the holistic semantic correlation between an entire image and a text description. In contrast, our WeCromCL conducts atomistic contrastive learning to model the character-wise appearance consistency between a text transcription and its correlated region in a scene image, and thereby detects an anchor point for the transcription in a weakly supervised manner. The anchor points detected by WeCromCL are further used as pseudo location labels to guide the learning of text spotting. Extensive experiments on four challenging benchmarks demonstrate the superior performance of our model over other methods. Code will be released.
Pages: 289-306
Page count: 18