Attention-Based Video Hashing for Large-Scale Video Retrieval

Cited by: 11
Authors
Wang, Yingxin [1 ]
Nie, Xiushan [2 ]
Shi, Yang [3 ]
Zhou, Xin [1 ]
Yin, Yilong [3 ]
Affiliations
[1] Shandong Univ, Sch Comp Sci & Technol, Jinan 250101, Peoples R China
[2] Shandong Jianzhu Univ, Sch Comp Sci & Technol, Jinan 250101, Peoples R China
[3] Shandong Univ, Sch Software, Jinan 250101, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Convolutional neural networks; Data models; Deep learning; Quantization (signal); Binary codes; Feature extraction; hashing; video hashing; video retrieval; IMAGE RETRIEVAL; BINARY; QUANTIZATION;
DOI
10.1109/TCDS.2019.2963339
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Large-scale video retrieval is a challenging problem because of the exponential growth of video collections on the Internet. To address this challenge, we propose an attention-based video hashing (AVH) method for large-scale video retrieval. Unlike most of the existing video hashing methods, which consider different frames within a video separately for hash learning, we use a convolutional neural network and long short-term memory (LSTM) network as the backbone to learn compact and discriminative hash codes by exploiting the structural information among different frames. To better capture informative clues in the video, an attention mechanism is added into the backbone, which can assign different weights to different LSTM time steps. Experiments were conducted to evaluate the proposed AVH method in comparison with existing methods. The experimental results on two widely used data sets show that our method outperforms existing state-of-the-art methods.
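The abstract's core idea, weighting LSTM time steps with attention before producing a binary code, can be illustrated with a minimal sketch. This is not the authors' implementation: the dot-product scoring, the `query` vector, the sign-threshold binarization, and all function names are assumptions made for illustration, and the per-frame hidden states are hard-coded toy values rather than real CNN-LSTM outputs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each time step by its (assumed) dot-product score against
    `query`, then return the weighted sum and the attention weights."""
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return pooled, weights


def binarize(vector):
    """Assumed quantization step: threshold each dimension at zero."""
    return [1 if v >= 0 else 0 for v in vector]


def hamming(code_a, code_b):
    """Number of differing bits; used to rank videos at retrieval time."""
    return sum(a != b for a, b in zip(code_a, code_b))


# Toy hidden states for three time steps (hypothetical values).
hidden_states = [
    [0.2, -0.1, 0.4],
    [1.0, -0.8, 0.3],
    [-0.3, 0.5, -0.2],
]
query = [1.0, -1.0, 0.0]  # hypothetical learned attention vector

pooled, weights = attention_pool(hidden_states, query)
code = binarize(pooled)
print(weights, code)
```

In this toy setup the second time step scores highest against `query`, so it dominates the pooled representation and therefore the resulting bits. A trained model would learn both the attention parameters and a projection to the target code length.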
Pages: 491 - 502
Number of pages: 12
Related papers
50 entries in total
  • [31] Large-scale video copy retrieval with temporal-concentration SIFT
    Zhu, Yingying
    Huang, Xiaoyan
    Huang, Qiang
    Tian, Qi
    [J]. NEUROCOMPUTING, 2016, 187 : 83 - 91
  • [32] Combining Boolean and Multimedia Retrieval in vitrivr for Large-Scale Video Search
    Sauter, Loris
    Parian, Mahnaz Amiri
    Gasser, Ralph
    Heller, Silvan
    Rossetto, Luca
    Schuldt, Heiko
    [J]. MULTIMEDIA MODELING (MMM 2020), PT II, 2020, 11962 : 760 - 765
  • [33] Large-Scale Video Retrieval via Deep Local Convolutional Features
    Zhang, Chen
    Hu, Bin
    Suo, Yucong
    Zou, Zhiqiang
    Ji, Yimu
    [J]. ADVANCES IN MULTIMEDIA, 2020, 2020
  • [34] An Adaptive Search Path Traverse for Large-scale Video Frame Retrieval
    Diep Thi-Ngoc Nguyen
    Kiyoki, Yasushi
    [J]. INFORMATION MODELLING AND KNOWLEDGE BASES XXVI, 2014, 272 : 324 - 342
  • [35] TEMPORAL AGGREGATION FOR LARGE-SCALE QUERY-BY-IMAGE VIDEO RETRIEVAL
    Araujo, Andre
    Chaves, Jason
    Angst, Roland
    Girod, Bernd
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2015, : 1518 - 1522
  • [36] Attention-Based Convolutional LSTM for Describing Video
    Liu, Zhongyu
    Chen, Tian
    Ding, Enjie
    Liu, Yafeng
    Yu, Wanli
    [J]. Ding, Enjie (enjied@cumt.edu.cn), 1600, Institute of Electrical and Electronics Engineers Inc. (08): : 133713 - 133724
  • [37] Attention-Based Convolutional LSTM for Describing Video
    Liu, Zhongyu
    Chen, Tian
    Ding, Enjie
    Liu, Yafeng
    Yu, Wanli
    [J]. IEEE ACCESS, 2020, 8 : 133713 - 133724
  • [38] An Attention-based Activity Recognition for Egocentric Video
    Matsuo, Kenji
    Yamada, Kentaro
    Ueno, Satoshi
    Naito, Sei
    [J]. 2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW), 2014, : 565 - +
  • [39] Multimodal attention-based transformer for video captioning
    Hemalatha Munusamy
    Chandra Sekhar C
    [J]. Applied Intelligence, 2023, 53 : 23349 - 23368
  • [40] Residual Attention-based Fusion for Video Classification
    Pouyanfar, Samira
    Wang, Tianyi
    Chen, Shu-Ching
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2019), 2019, : 478 - 480