An efficient method for time series similarity search using binary code representation and Hamming distance

Cited by: 7
Authors
Zhang, Haowen [1]
Dong, Yabo [1]
Li, Jing [1]
Xu, Duanqing [1]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou, Zhejiang, Peoples R China
Keywords
Time series; similarity measure; binary code representation; Hamming distance; approximation
DOI
10.3233/IDA-194876
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Time series similarity search is an essential operation in time series data mining and has attracted growing interest as time series data have become more widespread. Although many algorithms have been proposed for this problem, supporting similarity search that is both fast and accurate remains challenging. In this paper, we present a novel approach, TS2BC, to perform time series similarity search efficiently and effectively. TS2BC represents time series as binary codes and measures similarity using the Hamming distance. Our method represents the original data compactly, handles shifted time series, and works with time series of different lengths. Moreover, it has reasonably low complexity because the Hamming distance is cheap to compute. We extensively compare TS2BC with state-of-the-art algorithms in a classification framework using 61 online datasets. Experimental results show that TS2BC achieves better or comparable accuracy relative to other state-of-the-art methods and is much faster than most existing algorithms. Furthermore, we propose an approximate version of TS2BC to speed up the query procedure and evaluate its efficiency experimentally.
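
The abstract does not specify how TS2BC builds its binary codes, so the following is only a minimal sketch of the general idea of comparing binary-coded series under the Hamming distance. The encoding used here (one bit per step, set when the series rises) is an assumption chosen for illustration and is not the authors' scheme. The function names `to_binary_code`, `hamming_distance`, and `nearest_neighbor` are hypothetical, and the sketch assumes equal-length series, so it does not cover the shift and length handling the abstract mentions.

```python
def to_binary_code(series):
    """Pack a series into an integer bit string.

    Assumed encoding (NOT the TS2BC scheme): one bit per consecutive
    pair of points, 1 if the value rises and 0 otherwise.
    """
    bits = 0
    for prev, cur in zip(series, series[1:]):
        bits = (bits << 1) | (1 if cur > prev else 0)
    return bits


def hamming_distance(code_a, code_b):
    """Count the bit positions in which two packed codes differ."""
    # XOR leaves a 1 exactly where the codes disagree.
    return bin(code_a ^ code_b).count("1")


def nearest_neighbor(query, candidates):
    """Return the index of the candidate closest to the query.

    Assumes every candidate has the same length as the query.
    Ties are resolved in favor of the earliest candidate.
    """
    query_code = to_binary_code(query)
    codes = [to_binary_code(c) for c in candidates]
    return min(range(len(codes)),
               key=lambda i: hamming_distance(query_code, codes[i]))
```

Packing the code into an integer means each comparison reduces to one XOR and a bit count, which illustrates why a Hamming-distance comparison can be cheap relative to point-by-point distance measures.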
Pages: 439-461
Number of pages: 23