CHAIN: Exploring Global-Local Spatio-Temporal Information for Improved Self-Supervised Video Hashing

Cited by: 1
Authors
Wei, Rukai [1]
Liu, Yu [1]
Song, Jingkuan [2]
Cui, Heng [1]
Xie, Yanzhao [1]
Zhou, Ke [1]
Affiliations
[1] Huazhong University of Science and Technology, Wuhan, People's Republic of China
[2] University of Electronic Science and Technology of China, Chengdu, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Self-supervised video hashing; Spatio-temporal contrastive learning; Frame order verification; Scene change regularization
DOI
10.1145/3581783.3613440
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Compressing videos into binary codes can improve retrieval speed and reduce storage overhead. However, learning accurate hash codes for video retrieval is challenging because of high local redundancy and complex global dependencies between video frames, especially in the absence of labels. Existing self-supervised video hashing methods have designed expressive temporal encoders, but they do not fully exploit the temporal dynamics and spatial appearance of videos because their learning tasks are insufficiently challenging and unreliable. To address these challenges, we first use a contrastive learning task to capture global spatio-temporal information of videos for hashing. With the aid of our augmentation strategies, which apply spatial and temporal variations to create positive pairs, the learning framework can generate hash codes that are invariant to motion, scale, and viewpoint. Furthermore, we incorporate two collaborative learning tasks, i.e., frame order verification and scene change regularization, to capture local spatio-temporal details within video frames, thereby enhancing the perception of temporal structure and the modeling of spatio-temporal relationships. Our proposed Contrastive Hashing with Global-Local Spatio-temporal Information (CHAIN) outperforms state-of-the-art self-supervised video hashing methods on four video benchmark datasets. Our code will be released.
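The abstract's opening claim, that binary codes speed up retrieval and reduce storage, can be illustrated with a minimal sketch. It is not the CHAIN method: it assumes that codes come from thresholding real-valued features at zero (a common convention in hashing, not confirmed by this record), and the feature values and the helper names `binarize`, `hamming_distance`, and `retrieve` are hypothetical. Each code is packed into a single integer, so comparing two videos reduces to an XOR and a bit count.

```python
def binarize(features):
    """Pack the signs of a real-valued feature vector into an integer bit code.

    Assumption: positive values map to bit 1, everything else to bit 0.
    """
    code = 0
    for value in features:
        code = (code << 1) | (1 if value > 0 else 0)
    return code


def hamming_distance(code_a, code_b):
    """Count the bit positions in which two codes differ."""
    return bin(code_a ^ code_b).count("1")


def retrieve(query_code, database_codes, top_k):
    """Return indices of the top_k database codes closest to the query.

    sorted() is stable, so ties keep their original database order.
    """
    ranked = sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )
    return ranked[:top_k]


# Made-up 4-dimensional features standing in for encoder outputs.
database_features = [
    [0.9, -0.2, 0.4, -0.7],
    [-0.5, 0.3, -0.1, 0.8],
    [0.7, -0.6, 0.2, 0.5],
]
database_codes = [binarize(f) for f in database_features]
query_code = binarize([0.8, -0.1, 0.3, -0.9])

top_matches = retrieve(query_code, database_codes, top_k=2)
```

In this toy setup each video costs 4 bits of storage instead of four floating-point numbers, and ranking needs only integer operations; real systems typically use codes of tens to hundreds of bits.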
Pages: 1677-1688
Page count: 12