Spatiotemporal contrastive modeling for video moment retrieval

Cited by: 1

Authors:

Wang, Yi ^{[1,2]}

Li, Kun ^{[1,2]}

Chen, Guoliang ^{[1,2]}

Zhang, Yan ^{[1,2]}

Guo, Dan ^{[1,2]}

Wang, Meng ^{[1,2]}

Affiliations:

[1] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei 230601, Anhui, Peoples R China

[2] Hefei Univ Technol, Sch Artificial Intelligence, Hefei 230601, Anhui, Peoples R China

Source:

WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS | 2023 / Vol. 26 / Issue 04

Funding:

National Natural Science Foundation of China;

Keywords:

Video moment retrieval; Spatiotemporal modeling; Contrastive learning; Language query; Temporal localization; ACTION RECOGNITION;

DOI:

10.1007/s11280-022-01105-3

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

With the rapid development of social networks, video data has been growing explosively. As video is an important social medium, its spatiotemporal characteristics have attracted considerable attention in recommendation systems and video understanding. In this paper, we discuss the video moment retrieval (VMR) task, which locates moments in a video based on different textual queries. Existing methods follow two pipelines: 1) proposal-free approaches, which mainly modify the multi-modal interaction strategy; and 2) proposal-based methods, which are dedicated to designing advanced proposal generation paradigms. Recently, contrastive representation learning has been successfully applied to the field of video understanding. From a new perspective, we propose a VMR framework, named spatiotemporal contrastive network (STCNet), to learn discriminative boundary features for video grounding through contrastive learning. Specifically, we propose a boundary matching sampling module for dense negative sampling. Contrastive learning refines the feature representations in the training phase without any additional cost at inference. On three public datasets, Charades-STA, ActivityNet Captions and TACoS, our proposed method achieves competitive performance.

Pages: 1525-1544

Page count: 20

50 items in total

[31] Hybrid Contrastive Quantization for Efficient Cross-View Video Retrieval
Wang, Jinpeng
Chen, Bin
Liao, Dongliang
Zeng, Ziyun
Li, Gongfu
Xia, Shu-Tao
Xu, Jin
PROCEEDINGS OF THE ACM WEB CONFERENCE 2022 (WWW'22), 2022, : 3020 - 3030
[32] Expert-guided contrastive learning for video-text retrieval
Lee, Jewook
Lee, Pilhyeon
Park, Sungho
Byun, Hyeran
NEUROCOMPUTING, 2023, 536 : 50 - 58
[33] Cross-Modal Contrastive Hashing Retrieval for Infrared Video and EEG
Han, Jianan
Zhang, Shaoxing
Men, Aidong
Chen, Qingchao
SENSORS, 2022, 22 (22)
[34] Semantic Relevance Learning for Video-Query Based Video Moment Retrieval
Huo, Shuwei
Zhou, Yuan
Wang, Ruolin
Xiang, Wei
Kung, Sun-Yuan
IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 9290 - 9301
[35] Integrating Video Retrieval and Moment Detection in a Unified Corpus for Video Question Answering
Luo, Hongyin
Mohtarami, Mitra
Glass, James
Krishnamurthy, Karthik
Richardson, Brigitte
INTERSPEECH 2019, 2019, : 599 - 603
[36] Learning Unsupervised Visual Representations using 3D Convolutional Autoencoder with Temporal Contrastive Modeling for Video Retrieval
Kumar, Vidit
Tripathi, Vikas
Pant, Bhaskar
INTERNATIONAL JOURNAL OF MATHEMATICAL ENGINEERING AND MANAGEMENT SCIENCES, 2022, 7 (02) : 272 - 287
[37] Fine-Grained Spatiotemporal Motion Alignment for Contrastive Video Representation Learning
Zhu, Minghao
Lin, Xiao
Dang, Ronghao
Liu, Chengju
Chen, Qijun
PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 4725 - 4736
[38] Mining spatiotemporal video patterns towards robust action retrieval
Cao, Liujuan
Ji, Rongrong
Gao, Yue
Liu, Wei
Tian, Qi
NEUROCOMPUTING, 2013, 105 : 61 - 69
[39] Spatiotemporal retrieval of dynamic video object trajectories in geographical scenes
Xie, Yujia
Wang, Meizhen
Liu, Xuejun
Wang, Ziran
Mao, Bo
Wang, Feiyue
Wang, Xiaozhi
TRANSACTIONS IN GIS, 2021, 25 (01) : 450 - 467
[40] Moment is Important: Language-Based Video Moment Retrieval via Adversarial Learning
Zeng Y.
Cao D.
Lu S.
Zhang H.
Xu J.
Qin Z.
ACM Transactions on Multimedia Computing, Communications and Applications, 2022, 18 (02)