CHAIN: Exploring Global-Local Spatio-Temporal Information for Improved Self-Supervised Video Hashing

Cited by: 1
Authors
Wei, Rukai [1]
Liu, Yu [1]
Song, Jingkuan [2]
Cui, Heng [1]
Xie, Yanzhao [1]
Zhou, Ke [1]
Affiliations
[1] Huazhong University of Science and Technology, Wuhan, People's Republic of China
[2] University of Electronic Science and Technology of China, Chengdu, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Self-supervised video hashing; Spatio-temporal contrastive learning; Frame order verification; Scene change regularization
DOI
10.1145/3581783.3613440
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Compressing videos into binary codes can improve retrieval speed and reduce storage overhead. However, learning accurate hash codes for video retrieval can be challenging due to high local redundancy and complex global dependencies between video frames, especially in the absence of labels. Existing self-supervised video hashing methods have been effective in designing expressive temporal encoders, but have not fully utilized the temporal dynamics and spatial appearance of videos due to less challenging and unreliable learning tasks. To address these challenges, we begin by utilizing the contrastive learning task to capture global spatio-temporal information of videos for hashing. With the aid of our designed augmentation strategies, which focus on spatial and temporal variations to create positive pairs, the learning framework can generate hash codes that are invariant to motion, scale, and viewpoint. Furthermore, we incorporate two collaborative learning tasks, i.e., frame order verification and scene change regularization, to capture local spatio-temporal details within video frames, thereby enhancing the perception of temporal structure and the modeling of spatio-temporal relationships. Our proposed Contrastive Hashing with Global-Local Spatio-temporal Information (CHAIN) outperforms state-of-the-art self-supervised video hashing methods on four video benchmark datasets. Our code will be released.
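As a rough illustration of two ideas mentioned in the abstract (binary codes for fast retrieval, and contrastive learning over positive pairs), the sketch below uses NumPy. It is not the authors' implementation: the function names `binarize`, `hamming_distance`, and `info_nce`, the temperature value, and the choice of a generic InfoNCE-style loss are all assumptions made for illustration, and the paper's augmentation strategies and auxiliary tasks are not modeled.

```python
import numpy as np


def binarize(features):
    """Map real-valued features to {-1, +1} codes via the sign function."""
    return np.where(features >= 0, 1, -1).astype(np.int8)


def hamming_distance(query_code, db_codes):
    """Hamming distance between one code and every row of db_codes.

    For codes in {-1, +1} of length k: d_H = (k - <q, b>) / 2.
    """
    k = db_codes.shape[1]
    dots = db_codes.astype(np.int32) @ query_code.astype(np.int32)
    return (k - dots) // 2


def info_nce(z1, z2, temperature=0.5):
    """Generic InfoNCE-style loss (an assumed stand-in, not the paper's loss).

    Row i of z1 and row i of z2 are treated as a positive pair; all other
    rows of z2 in the batch act as negatives for row i of z1.
    """
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


if __name__ == "__main__":
    # Toy database of three 4-bit codes and one query.
    db = binarize(np.array([[0.9, -0.3, 0.2, 0.7],
                            [-0.5, 0.4, -0.1, -0.8],
                            [0.6, 0.1, 0.3, 0.2]]))
    query = binarize(np.array([0.8, -0.2, 0.5, 0.1]))
    distances = hamming_distance(query, db)
    ranking = np.argsort(distances)  # nearest codes first
    print(distances, ranking)
```

Ranking by Hamming distance only needs integer operations on short codes, which is the source of the speed and storage benefits the abstract refers to.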
Pages: 1677-1688
Number of pages: 12