Spatiotemporal contrastive modeling for video moment retrieval

Cited by: 1

Authors:

Wang, Yi [1,2]

Li, Kun [1,2]

Chen, Guoliang [1,2]

Zhang, Yan [1,2]

Guo, Dan [1,2]

Wang, Meng [1,2]

Affiliations:

[1] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei 230601, Anhui, Peoples R China

[2] Hefei Univ Technol, Sch Artificial Intelligence, Hefei 230601, Anhui, Peoples R China

Source:

WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS | 2023 / Vol. 26 / Issue 04

Funding:

National Natural Science Foundation of China;

Keywords:

Video moment retrieval; Spatiotemporal modeling; Contrastive learning; Language query; Temporal localization; Action recognition;

DOI:

10.1007/s11280-022-01105-3

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

With the rapid development of social networks, video data has grown explosively. Video is an important social medium, and its spatiotemporal characteristics have attracted considerable attention in recommender systems and video understanding. In this paper, we address the video moment retrieval (VMR) task, which locates moments in a video that match a given textual query. Existing methods follow two pipelines: 1) proposal-free approaches, which mainly modify the multi-modal interaction strategy; and 2) proposal-based approaches, which focus on designing advanced proposal generation paradigms. Recently, contrastive representation learning has been successfully applied to video understanding. From this perspective, we propose a new VMR framework, named spatiotemporal contrastive network (STCNet), which learns discriminative boundary features for video grounding through contrastive learning. Specifically, we propose a boundary matching sampling module for dense negative sampling. Contrastive learning refines the feature representations during training without any additional cost at inference. On three public datasets, Charades-STA, ActivityNet Captions and TACoS, the proposed method achieves competitive performance.
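As a rough illustration of the contrastive objective the abstract refers to, the sketch below computes a generic InfoNCE-style loss for one query embedding against one positive and several negative moment embeddings. It is not the STCNet implementation: the abstract does not give the loss form or say how the boundary matching sampling module picks negatives, so the function name `info_nce`, the `temperature` value, and the toy vectors are all assumptions made for illustration.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a non-zero vector to unit L2 norm."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def info_nce(query, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss (hypothetical helper, not from the paper).

    Returns -log( exp(s_pos / t) / sum_k exp(s_k / t) ), where s_k is the
    cosine similarity between the query and candidate k, and the
    candidates are the positive followed by all negatives.
    """
    q = normalize(query)
    candidates = [normalize(positive)] + [normalize(n) for n in negatives]
    logits = [dot(q, c) / temperature for c in candidates]
    # Log-sum-exp with the max subtracted for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]


# Toy embeddings (assumed): the positive moment lies close to the query,
# the negatives stand in for moments with poorly matching boundaries.
query = [1.0, 0.0]
positive = [0.9, 0.1]
negatives = [[0.0, 1.0], [-1.0, 0.2]]

loss_matched = info_nce(query, positive, negatives)
# Swapping in a poor match as the "positive" should give a larger loss.
loss_mismatched = info_nce(query, negatives[0], [positive, negatives[1]])
```

A loss of this kind is used only during training, which is consistent with the abstract's remark that contrastive learning adds no cost at inference.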

Pages: 1525-1544

Page count: 20

50 items in total

[1] Spatiotemporal contrastive modeling for video moment retrieval
Yi Wang
Kun Li
Guoliang Chen
Yan Zhang
Dan Guo
Meng Wang
World Wide Web, 2023, 26 : 1525 - 1544
[2] Video Moment Retrieval with Hierarchical Contrastive Learning
Zhang, Bolin
Yang, Chao
Jiang, Bin
Zhou, Xiaokang
PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022
[3] Video Corpus Moment Retrieval with Contrastive Learning
Zhang, Hao
Sun, Aixin
Jing, Wei
Nan, Guoshun
Zhen, Liangli
Zhou, Joey Tianyi
Goh, Rick Siow Mong
SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2021, : 685 - 695
[4] Momentum Cross-Modal Contrastive Learning for Video Moment Retrieval
Han, De
Cheng, Xing
Guo, Nan
Ye, Xiaochun
Rainer, Benjamin
Priller, Peter
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (07) : 5977 - 5994
[5] Hybrid Spatiotemporal Contrastive Representation Learning for Content-Based Surgical Video Retrieval
Kumar, Vidit
Tripathi, Vikas
Pant, Bhaskar
Alshamrani, Sultan S.
Dumka, Ankur
Gehlot, Anita
Singh, Rajesh
Rashid, Mamoon
Alshehri, Abdullah
AlGhamdi, Ahmed Saeed
ELECTRONICS, 2022, 11 (09)
[6] Spatiotemporal Contrastive Video Representation Learning
Qian, Rui
Meng, Tianjian
Gong, Boqing
Yang, Ming-Hsuan
Wang, Huisheng
Belongie, Serge
Cui, Yin
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 6960 - 6970
[7] Adversarial Video Moment Retrieval by Jointly Modeling Ranking and Localization
Cao, Da
Zeng, Yawen
Wei, Xiaochi
Nie, Liqiang
Hong, Richang
Qin, Zheng
MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 898 - 906
[8] Fast Video Moment Retrieval
Gao, Junyu
Xu, Changsheng
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 1503 - 1512
[9] Survey on Video Moment Retrieval
Wang Y.
Zhan Y.-W.
Luo X.
Liu M.
Xu X.-S.
Ruan Jian Xue Bao/Journal of Software, 2023, 34 (02) : 985 - 1006
[10] Cross-modal Contrastive Learning with Asymmetric Co-attention Network for Video Moment Retrieval
Panta, Love
Shrestha, Prashant
Sapkota, Brabeem
Bhattarai, Amrita
Manandhar, Suresh
Sah, Anand Kumar
2024 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WORKSHOPS, WACVW 2024, 2024, : 617 - 624
