Efficient discovery of unusual patterns in time series

Cited by: 7
Authors
Lonardi S. [1]
Lin J. [2]
Keogh E. [1]
Chiu B. [1]
Affiliations
[1] Department of Computer Science and Engineering, University of California, Riverside
[2] Department of Information and Software Engineering, George Mason University, Fairfax
Funding
U.S. National Science Foundation
Keywords
Anomaly detection; Feature extraction; Markov model; Novelty detection; Suffix tree; Time series
DOI
10.1007/s00354-006-0004-2
Abstract
The problem of finding a specified pattern in a time series database (i.e., query by content) has received much attention and is now a relatively mature field. In contrast, the important problem of enumerating all surprising or interesting patterns has received far less attention. This problem requires a meaningful definition of "surprise" and an efficient search technique. All previous attempts at finding surprising patterns in time series use a very limited notion of surprise, and/or do not scale to massive datasets. To overcome these limitations, we propose a novel technique that defines a pattern as surprising if the frequency of its occurrence differs substantially from that expected by chance, given some previously seen data. This notion has the advantage of not requiring the user to explicitly define what a surprising pattern is, which may be hard, or perhaps impossible, to elicit from a domain expert. Instead, the user gives the algorithm a collection of previously observed "normal" data. Our algorithm uses a suffix tree to efficiently encode the frequency of all observed patterns and allows a Markov model to predict the expected frequency of previously unobserved patterns. Once the suffix tree has been constructed, a measure of surprise for all the patterns in a new database can be determined in time and space linear in the size of the database. We demonstrate the utility of our approach with an extensive experimental evaluation. © Ohmsha, Ltd. 2007.
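The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it makes several assumptions: the time series has already been discretized into a string of symbols, a plain dictionary of substring counts stands in for the suffix tree (so it is not linear in time or space), and surprise is scored as the observed count minus the expected count. The function names `count_substrings`, `expected_count`, and `surprise_scores` are hypothetical.

```python
from collections import Counter
from math import prod


def count_substrings(s, max_len):
    """Count every substring of length 1..max_len.

    A dictionary stand-in for the suffix tree (assumption: simplicity over speed).
    """
    counts = Counter()
    for i in range(len(s)):
        for k in range(1, max_len + 1):
            if i + k > len(s):
                break
            counts[s[i:i + k]] += 1
    return counts


def expected_count(w, ref_counts, ref_len, new_len):
    """Estimate how often pattern w should occur in a string of length new_len.

    Assumes len(w) <= ref_len. Seen patterns use their rescaled reference
    count; unseen patterns fall back to a Markov-chain estimate.
    """
    m = len(w)
    windows_new = new_len - m + 1
    scale = windows_new / (ref_len - m + 1)
    if ref_counts[w] > 0:
        return scale * ref_counts[w]
    # Try the highest Markov order whose (order + 1)-grams were all observed.
    for order in range(m - 2, 0, -1):
        num = prod(ref_counts[w[j:j + order + 1]] for j in range(m - order))
        if num > 0:
            den = prod(ref_counts[w[j:j + order]] for j in range(1, m - order))
            return scale * num / den
    # Order 0: treat symbols as independent.
    return windows_new * prod(ref_counts[ch] / ref_len for ch in w)


def surprise_scores(reference, new, max_len):
    """Score each pattern in `new` as observed count minus expected count."""
    ref_counts = count_substrings(reference, max_len)
    new_counts = count_substrings(new, max_len)
    return {
        w: c - expected_count(w, ref_counts, len(reference), len(new))
        for w, c in new_counts.items()
    }


if __name__ == "__main__":
    scores = surprise_scores("abbcab", "abcabc", max_len=3)
    # List patterns from most to least surprising by absolute score.
    for w, z in sorted(scores.items(), key=lambda kv: -abs(kv[1])):
        print(w, round(z, 3))
```

Patterns whose scores are far from zero in either direction are the candidates for "surprising": they occur much more or much less often in the new data than the reference data suggests.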
Pages: 61 - 93
Number of pages: 32
Related papers
50 in total
  • [1] Efficient discovery of unusual patterns in time series
    Lonardi, Stefano
    Lin, Jessica
    Keogh, Eamonn
    Chiu, Bill 'Yuan-chi'
    [J]. NEW GENERATION COMPUTING, 2007, 25 (01) : 61 - 93
  • [2] Comprehensive and efficient discovery of time series motifs
    Lian-hua Chi
    He-hua Chi
    Yu-cai Feng
    Shu-liang Wang
    Zhong-sheng Cao
    [J]. Journal of Zhejiang University SCIENCE C, 2011, 12 : 1000 - 1009
  • [3] Efficient Shapelet Discovery for Time Series Classification
    Li, Guozhong
    Choi, Byron
    Xu, Jianliang
    Bhowmick, Sourav S.
    Chun, Kwok-Pan
    Wong, Grace Lai-Hung
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (03) : 1149 - 1163
  • [4] Comprehensive and efficient discovery of time series motifs
    Chi, Lian-hua
    Chi, He-hua
    Feng, Yu-cai
    Wang, Shu-liang
    Cao, Zhong-sheng
    [J]. JOURNAL OF ZHEJIANG UNIVERSITY-SCIENCE C-COMPUTERS & ELECTRONICS, 2011, 12 (12) : 1000 - 1009
  • [5] Comprehensive and efficient discovery of time series motifs
    Lian-hua CHI
    [J]. Frontiers of Information Technology & Electronic Engineering, 2011, (12) : 1000 - 1009
  • [6] Discovery of patterns preceding earthquakes in Chilean time series
    Florido, E.
    Martinez-Alvarez, F.
    Aznarte, J. L.
    Morales-Esteban, A.
    Reyes, J.
    Troncoso, A.
    [J]. INTERNATIONAL WORK-CONFERENCE ON TIME SERIES (ITISE 2014), 2014, : 819 - 826
  • [7] Efficient Proper Length Time Series Motif Discovery
    Yingchareonthawornchai, Sorrachai
    Sivaraks, Haemwaan
    Rakthanmanon, Thanawin
    Ratanamahatana, Chotirat Ann
    [J]. 2013 IEEE 13TH INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2013, : 1265 - 1270
  • [8] Semantic Discord: Finding Unusual Local Patterns for Time Series
    Zhang, Li
    Gao, Yifeng
    Lin, Jessica
    [J]. PROCEEDINGS OF THE 2020 SIAM INTERNATIONAL CONFERENCE ON DATA MINING (SDM), 2020, : 136 - 144
  • [9] Detecting Unusual Temporal Patterns in Fisheries Time Series Data
    Wagner, Tyler
    Midway, Stephen R.
    Vidal, Tiffany
    Irwin, Brian J.
    Jackson, James R.
    [J]. TRANSACTIONS OF THE AMERICAN FISHERIES SOCIETY, 2016, 145 (04) : 786 - 794
  • [10] Privacy-preserving discovery of frequent patterns in time series
    da Silva, Josenildo Costa
    Klusch, Matthias
    [J]. ADVANCES IN DATA MINING: THEORETICAL ASPECTS AND APPLICATIONS, PROCEEDINGS, 2007, 4597 : 318 - +