A Novel Parallel Scheme for Fast Similarity Search in Large Time Series

Cited by: 9
Authors
Yin Hong [1, 3]
Yang Shuqiang [1]
Ma Shaodong [2]
Liu Fei [1]
Chen Zhikun [1]
Affiliations
[1] Natl Univ Def Technol, Coll Comp, Changsha 410073, Hunan, Peoples R China
[2] Univ Hull, Sch Engn, Kingston Upon Hull HU6 7RX, N Humberside, England
[3] Xiangyang Sch NCOs, Xiangyang 441118, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
similarity; DTW; warping path; time series; MapReduce; parallelization; cluster
DOI
10.1109/CC.2015.7084408
Chinese Library Classification number
TN [Electronic technology, communication technology]
Discipline classification code
0809
Abstract
Similarity search is a fundamental component of time series data mining tasks such as clustering, classification, and association rule mining. Many measures have been proposed for the similarity between time series, including Euclidean distance, Manhattan distance, and dynamic time warping (DTW). Compared with the other two, DTW gives a more robust similarity measure and can find the optimal alignment between time series. However, because of its quadratic time and space complexity, DTW is poorly suited to large time series datasets. Many improved algorithms have been proposed for DTW search in large databases, such as approximate search and exact indexed search. Unlike these modified algorithms, this paper presents a novel parallel scheme for fast DTW-based similarity search, called MRDTW (MapReduce-based DTW). The experimental results show that the approach not only retains the accuracy of the original DTW but also greatly improves the efficiency of similarity measurement on large time series.
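The abstract does not spell out how MRDTW is built, so the following is only a minimal sketch of the two ideas it names: the quadratic-time DTW recurrence, and a generic map/reduce pattern for nearest-series search over a partitioned dataset. The function names (`dtw_distance`, `map_partition`, `nearest_series`), the absolute-difference local cost, and the partition layout are all assumptions made for illustration, not the paper's method.

```python
from functools import reduce


def dtw_distance(a, b):
    """Classic DTW with absolute-difference local cost (an assumed cost).

    Runs in O(len(a) * len(b)) time; keeping only two rows of the
    cumulative-cost matrix reduces memory to O(len(b)).
    """
    inf = float("inf")
    m = len(b)
    # prev[j]: cumulative cost of aligning the rows seen so far with b[:j]
    prev = [0.0] + [inf] * m
    for x in a:
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(x - b[j - 1])
            # Extend the cheapest of the three allowed warping-path steps.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


def map_partition(query, partition):
    """Map step (hypothetical): best (distance, series_id) in one partition.

    Assumes each partition is a non-empty list of (series_id, series) pairs.
    """
    return min((dtw_distance(query, s), sid) for sid, s in partition)


def nearest_series(query, partitions):
    """Reduce step (hypothetical): keep the global minimum over partitions."""
    return reduce(min, (map_partition(query, p) for p in partitions))


if __name__ == "__main__":
    partitions = [
        [("a", [1, 1, 2, 3, 4]), ("b", [4, 3, 2, 1])],
        [("c", [2, 3, 4, 5])],
    ]
    print(nearest_series([1, 2, 3, 4], partitions))
```

Each map call touches only its own partition, so in a real MapReduce deployment the partitions could be processed on separate workers. Every candidate is still scored with exact DTW, which is why a scheme of this shape can keep the accuracy of sequential DTW search.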
Pages: 129-140
Number of pages: 12