Ranking and significance of variable-length similarity-based time series motifs

Cited by: 3
Authors
Serra, Joan [1, 2]
Serra, Isabel [3]
Corral, Alvaro [3]
Lluis Arcos, Josep [2]
Affiliations
[1] Telefon Res, Barcelona, Spain
[2] Artificial Intelligence Res Inst IIIA CSIC, Barcelona, Spain
[3] Ctr Recerca Matemat, Barcelona, Spain
Keywords
Time series; Motif ranking; Distance modeling; Beta distribution; Classification
DOI
10.1016/j.eswa.2016.02.026
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The detection of very similar patterns in a time series, commonly called motifs, has received continuous and increasing attention from diverse scientific communities. In particular, recent approaches for discovering similar motifs of different lengths have been proposed. In this work, we show that such variable-length similarity-based motifs cannot be directly compared, and hence ranked, by their normalized dissimilarities. Specifically, we find that length-normalized motif dissimilarities still have intrinsic dependencies on the motif length, and that the lowest dissimilarities are particularly affected by this dependency. Moreover, we find that such dependencies are generally non-linear and change with the considered data set and dissimilarity measure. Based on these findings, we propose a solution to rank (previously obtained) motifs of different lengths and measure their significance. This solution relies on a compact but accurate model of the dissimilarity space, using a beta distribution with three parameters that depend on the motif length in a non-linear way. We believe the incomparability of variable-length dissimilarities could have an impact beyond the field of time series, and that modeling strategies similar to the one used here could help in a broader context and in diverse application scenarios. © 2016 Elsevier Ltd. All rights reserved.
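The ranking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes dissimilarities already scaled to (0, 1), fits a plain two-shape-parameter beta distribution separately for each motif length by the method of moments (the paper instead uses three parameters that vary with length), and treats the fitted CDF at a motif's dissimilarity as a p-value-like score. The function names, the synthetic background samples, and the two example motifs are all hypothetical.

```python
import math
import random


def fit_beta_moments(values):
    """Method-of-moments estimate of beta shape parameters (a, b) for values in (0, 1)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    common = mean * (1.0 - mean) / var - 1.0
    return mean * common, (1.0 - mean) * common


def beta_cdf(x, a, b, steps=2000):
    """Approximate the beta CDF at x by midpoint integration of the density."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    h = x / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += math.exp(log_norm + (a - 1.0) * math.log(t) + (b - 1.0) * math.log1p(-t))
    return min(1.0, total * h)


# Hypothetical background dissimilarities for two motif lengths (synthetic data).
random.seed(0)
background = {
    50: [random.betavariate(2, 5) for _ in range(5000)],
    200: [random.betavariate(5, 5) for _ in range(5000)],
}

# One beta model per length (assumption: lengths are modeled independently here).
models = {length: fit_beta_moments(sample) for length, sample in background.items()}

# Hypothetical motifs: (name, length, length-normalized dissimilarity).
motifs = [("short", 50, 0.10), ("long", 200, 0.15)]

# Score each motif by the probability of seeing a dissimilarity this low at its length.
scored = [(name, length, d, beta_cdf(d, *models[length])) for name, length, d in motifs]
ranked = sorted(scored, key=lambda item: item[3])

for name, length, d, p in ranked:
    print(f"{name}: length={length}, dissimilarity={d:.2f}, p={p:.4f}")
```

In this synthetic setup the longer motif ranks first even though its raw dissimilarity is higher, because low dissimilarities are rarer at that length under the fitted model. That is the kind of reordering the abstract argues a raw normalized dissimilarity cannot capture.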
Pages: 452-460
Number of pages: 9
Related papers (50 in total)
  • [1] Iterative Grammar-based Framework for Discovering Variable-Length Time Series Motifs
    Gao, Yifeng
    Lin, Jessica
    Rangwala, Huzefa
    2017 17TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2017: 111-116
  • [2] An Efficient Method for Discovering Variable-length Motifs in Time Series based on Suffix Array
    Nguyen Ngoc Phien
    Nguyen Trong Nhan
    Duong Tuan Anh
    SOICT 2019: PROCEEDINGS OF THE TENTH INTERNATIONAL SYMPOSIUM ON INFORMATION AND COMMUNICATION TECHNOLOGY, 2019: 125-131
  • [3] Iterative Grammar-based Framework for Discovering Variable-Length Time Series Motifs
    Gao, Yifeng
    Lin, Jessica
    Rangwala, Huzefa
    2016 15TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2016), 2016: 7-12
  • [4] Exploring variable-length time series motifs in one hundred million length scale
    Yifeng Gao
    Jessica Lin
    Data Mining and Knowledge Discovery, 2018, 32: 1200-1228
  • [5] Exploring variable-length time series motifs in one hundred million length scale
    Gao, Yifeng
    Lin, Jessica
    DATA MINING AND KNOWLEDGE DISCOVERY, 2018, 32(05): 1200-1228
  • [6] A Variable-Length Motifs Discovery Method in Time Series using Hybrid Approach
    Zan, Chaw Thet
    Yamana, Hayato
    19TH INTERNATIONAL CONFERENCE ON INFORMATION INTEGRATION AND WEB-BASED APPLICATIONS & SERVICES (IIWAS2017), 2017: 49-57
  • [7] HIME: discovering variable-length motifs in large-scale time series
    Gao, Yifeng
    Lin, Jessica
    KNOWLEDGE AND INFORMATION SYSTEMS, 2019, 61(01): 513-542
  • [8] HIME: discovering variable-length motifs in large-scale time series
    Yifeng Gao
    Jessica Lin
    Knowledge and Information Systems, 2019, 61: 513-542
  • [9] Variable-Length Subsequence Clustering in Time Series
    Duan, Jiangyong
    Guo, Lili
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34(02): 983-995
  • [10] Scalable, Variable-Length Similarity Search in Data Series: The ULISSE Approach
    Linardi, Michele
    Palpanas, Themis
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2018, 11(13): 2236-2248