Efficient discovery of unusual patterns in time series

Cited by: 7
Authors
Lonardi S. [1]
Lin J. [2]
Keogh E. [1]
Chiu B. [1]
Affiliations
[1] Department of Computer Science and Engineering, University of California, Riverside
[2] Department of Information and Software Engineering, George Mason University, Fairfax
Funding
National Science Foundation (USA)
Keywords
Anomaly detection; Feature extraction; Markov model; Novelty detection; Suffix tree; Time series
DOI
10.1007/s00354-006-0004-2
Abstract
The problem of finding a specified pattern in a time series database (i.e., query by content) has received much attention and is now a relatively mature field. In contrast, the important problem of enumerating all surprising or interesting patterns has received far less attention. This problem requires a meaningful definition of "surprise" and an efficient search technique. All previous attempts at finding surprising patterns in time series use a very limited notion of surprise, and/or do not scale to massive datasets. To overcome these limitations, we propose a novel technique that defines a pattern as surprising if the frequency of its occurrence differs substantially from that expected by chance, given some previously seen data. This notion has the advantage of not requiring the user to explicitly define what a surprising pattern is, which may be hard, or perhaps impossible, to elicit from a domain expert. Instead, the user gives the algorithm a collection of previously observed "normal" data. Our algorithm uses a suffix tree to efficiently encode the frequency of all observed patterns and allows a Markov model to predict the expected frequency of previously unobserved patterns. Once the suffix tree has been constructed, a measure of surprise for all the patterns in a new database can be determined in time and space linear in the size of the database. We demonstrate the utility of our approach with an extensive experimental evaluation. © Ohmsha, Ltd. 2007.
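The idea in the abstract can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes the time series has already been discretized into a string of symbols, and it swaps the suffix tree for a plain `Counter` of substrings, which is simpler but not linear in time or space. The function names (`substring_counts`, `expected_count`, `surprise_scores`), the window-ratio scaling, and the use of "observed minus expected" as the score are all assumptions made for illustration.

```python
from collections import Counter


def substring_counts(text: str, max_len: int) -> Counter:
    """Count every substring of length 1..max_len.

    A plain-dictionary stand-in for the suffix tree named in the abstract.
    """
    counts = Counter()
    for length in range(1, max_len + 1):
        for start in range(len(text) - length + 1):
            counts[text[start:start + length]] += 1
    return counts


def expected_count(word: str, ref_counts: Counter, scale: float) -> float:
    """Estimate how often `word` should occur, given reference counts.

    Patterns seen in the reference use their observed count. Unseen
    patterns fall back to a Markov-style estimate built from shorter
    substrings: f(prefix) * f(suffix) / f(overlap).
    """
    if ref_counts[word] > 0:
        return scale * ref_counts[word]
    if len(word) < 2:
        return 0.0
    prefix, suffix, overlap = word[:-1], word[1:], word[1:-1]
    if overlap:
        denom = ref_counts[overlap]
    else:
        # Two-symbol word: normalize by the total number of symbols.
        denom = sum(c for w, c in ref_counts.items() if len(w) == 1)
    if denom == 0:
        return 0.0
    return scale * ref_counts[prefix] * ref_counts[suffix] / denom


def surprise_scores(reference: str, new: str, length: int) -> dict:
    """Score each length-`length` pattern in `new` as observed - expected."""
    ref_windows = len(reference) - length + 1
    new_windows = len(new) - length + 1
    if ref_windows <= 0 or new_windows <= 0:
        raise ValueError("both strings must be at least `length` symbols long")
    ref_counts = substring_counts(reference, length)
    # Rescale reference counts to the number of windows in the new data.
    scale = new_windows / ref_windows
    observed = Counter(new[i:i + length] for i in range(new_windows))
    return {
        word: count - expected_count(word, ref_counts, scale)
        for word, count in observed.items()
    }
```

Calling `surprise_scores("ab" * 8, "abababbbabab", 3)` returns one score per distinct three-symbol pattern in the new string. Scores far from zero in either direction mark patterns that occur more or less often than the "normal" reference data suggests.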
Pages: 61-93
Page count: 32