Weakly Supervised Temporal Sentence Grounding with Gaussian-based Contrastive Proposal Learning

Cited by: 42

Authors:

Zheng, Minghang [1]

Huang, Yanjie [1]

Chen, Qingchao [2]

Peng, Yuxin [1]

Liu, Yang [1,3]

Affiliations:

[1] Peking University, Wangxuan Institute of Computer Technology, Beijing, People's Republic of China

[2] Peking University, National Institute of Health Data Science, Beijing, People's Republic of China

[3] Beijing Institute for General Artificial Intelligence, Beijing, People's Republic of China

Source:

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022) | 2022

Funding:

National Natural Science Foundation of China

DOI:

10.1109/CVPR52688.2022.01511

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Temporal sentence grounding aims to detect the most salient moment in an untrimmed video that corresponds to a natural language query. Because labeling temporal boundaries is labor-intensive and subjective, weakly supervised methods have recently received increasing attention. Most existing weakly supervised methods generate proposals with sliding windows, which are content-independent and of low quality. Moreover, they train the model to distinguish positive visual-language pairs from negative ones randomly collected from other videos, ignoring the highly confusing video segments within the same video. In this paper, we propose Contrastive Proposal Learning (CPL) to overcome these limitations. Specifically, we use multiple learnable Gaussian functions to generate both positive and negative proposals within the same video, which can characterize the multiple events in a long video. We then propose a controllable easy-to-hard negative proposal mining strategy to collect negative samples within the same video, which eases model optimization and enables CPL to distinguish highly confusing scenes. Experiments show that our method achieves state-of-the-art performance on the Charades-STA and ActivityNet Captions datasets. The code and models are available at https://github.com/minghangz/cpl.
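The abstract gives no formulas, so the following is only a minimal sketch of the two ideas it names: a Gaussian over normalized video time used as a soft proposal, and negatives from the same video whose distance from the positive shrinks over training (easy to hard). Every function name, the `sigma_scale` constant, the linear schedule, and the way negatives are placed beside the positive are assumptions made for illustration; they are not taken from the paper or from the released code. In the actual method the Gaussian parameters are learned, whereas here they are passed in by hand.

```python
import math


def gaussian_mask(center, width, num_clips, sigma_scale=9.0):
    """Soft proposal: one weight per clip from a Gaussian over normalized time.

    center and width are in [0, 1]; num_clips must be at least 2.
    sigma_scale is a hypothetical constant tying the width to sigma.
    """
    sigma = max(width / sigma_scale, 1e-4)
    positions = [i / (num_clips - 1) for i in range(num_clips)]
    return [math.exp(-((t - center) ** 2) / (2 * sigma ** 2)) for t in positions]


def mask_to_segment(center, width):
    """Convert a (center, width) pair to a clipped (start, end) segment."""
    start = max(0.0, center - width / 2)
    end = min(1.0, center + width / 2)
    return start, end


def offset_schedule(step, total_steps, easy=0.5, hard=0.05):
    """Assumed linear schedule: large offset (easy) early, small offset (hard) late."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return easy + (hard - easy) * frac


def negative_centers(center, width, offset):
    """Assumed placement: one negative on each side of the positive proposal.

    A smaller offset puts the negatives closer to the positive, so they
    are harder to tell apart from it.
    """
    left = max(0.0, center - width / 2 - offset)
    right = min(1.0, center + width / 2 + offset)
    return left, right


if __name__ == "__main__":
    mask = gaussian_mask(center=0.5, width=0.3, num_clips=11)
    print("peak clip index:", mask.index(max(mask)))
    for step in (0, 50, 100):
        off = offset_schedule(step, total_steps=100)
        print(step, negative_centers(0.5, 0.3, off))
```

Running the script prints the index of the clip with the largest weight and shows the negative centers moving toward the positive proposal as the step count grows.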


Pages: 15534-15543

Page count: 10

