DLSTM Approach to Video Modeling with Hashing for Large-Scale Video Retrieval

Cited by: 0
Authors
Zhuang, Naifan [1]
Ye, Jun [1]
Hua, Kien A. [1]
Affiliations
[1] Univ Cent Florida, Dept Comp Sci, Orlando, FL 32816 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Although Query-by-Example techniques based on Euclidean distance in a multidimensional feature space have proved to be effective for image databases, this approach cannot be effectively applied to video since the number of dimensions would be massive due to the richness and complexity of video data. The above issue has been addressed in two recent solutions, namely Deterministic Quantization (DQ) and Dynamic Temporal Quantization (DTQ). DQ divides the video into equal segments and extracts a visual feature vector for each segment. The bag-of-words feature is then encoded by hashing to facilitate approximate nearest neighbor search using Hamming distance. One weakness of this approach is the deterministic segmentation of video data. DTQ improves on this by using dynamic video segmentation to obtain variable-length video segments. As a result, feature vectors extracted from these video segments can better capture the semantic content of the video. To support very large video databases, it is desirable to minimize the number of segments in order to keep the size of the feature representation as small as possible. We achieve this by using only one video segment (i.e., no video data segmentation is even necessary) with even better retrieval performance. Our scheme models video using differential long short-term memory (DLSTM) recurrent neural networks and obtains a highly compact fixed-size feature representation with the output of hidden states of the DLSTM. Each of these features is further compressed by hashing it into binary bits via quantization. Experimental results based on two public data sets, UCF101 and MSRActionPairs, indicate that the proposed video modeling technique outperforms DTQ by a significant margin.
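The retrieval step described in the abstract (binary codes obtained by quantization, then nearest neighbor search by Hamming distance) can be illustrated with a minimal sketch. The abstract does not specify the quantizer, so thresholding each dimension at its database median is an assumption made here for illustration. The toy vectors stand in for DLSTM hidden-state features, which are not computed, and all function names are hypothetical.

```python
from statistics import median


def fit_thresholds(features):
    """Per-dimension median over the database features (assumed quantizer)."""
    return [median(column) for column in zip(*features)]


def to_code(feature, thresholds):
    """Pack one feature vector into an int: bit i is set if feature[i] > thresholds[i]."""
    code = 0
    for i, (value, threshold) in enumerate(zip(feature, thresholds)):
        if value > threshold:
            code |= 1 << i
    return code


def hamming(a, b):
    """Number of bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


def search(query_code, database_codes, k=1):
    """Indices of the k database codes closest to the query in Hamming distance."""
    ranked = sorted(
        range(len(database_codes)),
        key=lambda i: hamming(query_code, database_codes[i]),
    )
    return ranked[:k]


# Toy stand-ins for fixed-size video features (hypothetical values).
database = [
    [0.9, 0.1, 0.8, 0.2],
    [0.1, 0.9, 0.2, 0.8],
    [0.8, 0.2, 0.9, 0.1],
    [0.2, 0.8, 0.1, 0.9],
]
thresholds = fit_thresholds(database)
codes = [to_code(f, thresholds) for f in database]
query = to_code([0.85, 0.15, 0.7, 0.3], thresholds)
nearest = search(query, codes, k=2)
```

Because each video is reduced to a single short integer code in this sketch, comparing a query against a database entry costs one XOR and one bit count, which is the property that makes Hamming-distance search attractive for large collections.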
Pages: 3222-3227
Number of pages: 6
Related papers
50 in total
  • [1] Attention-Based Video Hashing for Large-Scale Video Retrieval
    Wang, Yingxin
    Nie, Xiushan
    Shi, Yang
    Zhou, Xin
    Yin, Yilong
    [J]. IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2021, 13 (03) : 491 - 502
  • [2] Unsupervised Deep Video Hashing via Balanced Code for Large-Scale Video Retrieval
    Wu, Gengshen
    Han, Jungong
    Guo, Yuchen
    Liu, Li
    Ding, Guiguang
    Ni, Qiang
    Shao, Ling
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2019, 28 (04) : 1993 - 2007
  • [3] Classification-enhancement deep hashing for large-scale video retrieval
    Nie, Xiushan
    Zhou, Xin
    Shi, Yang
    Sun, Jiande
    Yin, Yilong
    [J]. APPLIED SOFT COMPUTING, 2021, 109
  • [4] Stochastic Multiview Hashing for Large-Scale Near-Duplicate Video Retrieval
    Hao, Yanbin
    Mu, Tingting
    Hong, Richang
    Wang, Meng
    An, Ning
    Goulermas, John Y.
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2017, 19 (01) : 1 - 14
  • [5] Effective Multiple Feature Hashing for Large-Scale Near-Duplicate Video Retrieval
    Song, Jingkuan
    Yang, Yi
    Huang, Zi
    Shen, Heng Tao
    Luo, Jiebo
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2013, 15 (08) : 1997 - 2008
  • [6] Large-Scale Video Hashing via Structure Learning
    Ye, Guangnan
    Liu, Dong
    Wang, Jun
    Chang, Shih-Fu
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2013, : 2272 - 2279
  • [7] Face Retrieval on Large-Scale Video Data
    Herrmann, Christian
    Beyerer, Juergen
    [J]. 2015 12TH CONFERENCE ON COMPUTER AND ROBOT VISION CRV 2015, 2015, : 192 - 199
  • [8] Joint Multi-View Hashing for Large-Scale Near-Duplicate Video Retrieval
    Nie, Xiushan
    Jing, Weizhen
    Cui, Chaoran
    Zhang, Chen Jason
    Zhu, Lei
    Yin, Yilong
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2020, 32 (10) : 1951 - 1965
  • [9] A Supervised Video Hashing Method Based on a Deep 3D Convolutional Neural Network for Large-Scale Video Retrieval
    Chen, Hanqing
    Hu, Chunyan
    Lee, Feifei
    Lin, Chaowei
    Yao, Wei
    Chen, Lu
    Chen, Qiu
    [J]. SENSORS, 2021, 21 (09)
  • [10] Face Retrieval in Large-Scale News Video Datasets
    Thanh Duc Ngo
    Hung Thanh Vu
    Duy-Dinh Le
    Satoh, Shin'ichi
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2013, E96D (08) : 1811 - 1825