Efficient discovery of unusual patterns in time series

Authors
Lonardi, Stefano [1]
Lin, Jessica
Keogh, Eamonn
Chiu, Bill 'Yuan-chi'
Affiliations
[1] Univ Calif Riverside, Dept Comp Sci & Engn, Riverside, CA 92521 USA
[2] George Mason Univ, Dept Informat & Software Engn, Fairfax, VA 22030 USA
Keywords
time series; suffix tree; novelty detection; anomaly detection; Markov model; feature extraction
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The problem of finding a specified pattern in a time series database (i.e., query by content) has received much attention and is now a relatively mature field. In contrast, the important problem of enumerating all surprising or interesting patterns has received far less attention. This problem requires both a meaningful definition of "surprise" and an efficient search technique. All previous attempts at finding surprising patterns in time series use a very limited notion of surprise, do not scale to massive datasets, or both. To overcome these limitations, we propose a novel technique that defines a pattern as surprising if the frequency of its occurrence differs substantially from that expected by chance, given some previously seen data. This notion has the advantage of not requiring the user to explicitly define what a surprising pattern is, which may be hard, or perhaps impossible, to elicit from a domain expert. Instead, the user gives the algorithm a collection of previously observed "normal" data. Our algorithm uses a suffix tree to efficiently encode the frequency of all observed patterns and a Markov model to predict the expected frequency of previously unobserved patterns. Once the suffix tree has been constructed, a measure of surprise for all the patterns in a new database can be determined in time and space linear in the size of the database. We demonstrate the utility of our approach with an extensive experimental evaluation.
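The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: the time series has already been discretized into a string of symbols, a plain substring counter stands in for the suffix tree (so it is not linear in time or space), the Markov estimate uses only the highest usable order with no back-off, and the surprise score is simply observed count minus expected count. The function names `count_substrings`, `expected_count`, and `surprise_scores` are hypothetical.

```python
from collections import Counter


def count_substrings(text, max_len):
    """Count every substring of length 1..max_len.

    Assumption: a simple counter replaces the suffix tree from the abstract.
    """
    counts = Counter()
    for length in range(1, max_len + 1):
        for start in range(len(text) - length + 1):
            counts[text[start:start + length]] += 1
    return counts


def expected_count(pattern, ref_counts, ref_len, new_len):
    """Estimate how often `pattern` should occur in a string of length new_len."""
    m = len(pattern)
    # Ratio of the number of length-m windows in the new and reference strings.
    scale = (new_len - m + 1) / (ref_len - m + 1)
    if ref_counts[pattern] > 0:
        # Pattern was observed in the reference data: rescale its count.
        return scale * ref_counts[pattern]
    if m == 1:
        # An unseen single symbol has no shorter context to estimate from.
        return 0.0
    # Unseen pattern: Markov estimate of order m - 2 (assumed form),
    # count(prefix) * count(suffix) / count(middle).
    middle = ref_len if m == 2 else ref_counts[pattern[1:-1]]
    if middle == 0:
        return 0.0
    return scale * ref_counts[pattern[:-1]] * ref_counts[pattern[1:]] / middle


def surprise_scores(reference, new, length):
    """Return observed minus expected count for each length-`length` pattern in `new`."""
    ref_counts = count_substrings(reference, length)
    new_counts = count_substrings(new, length)
    scores = {}
    for pattern, observed in new_counts.items():
        if len(pattern) == length:
            expected = expected_count(pattern, ref_counts, len(reference), len(new))
            scores[pattern] = observed - expected
    return scores


if __name__ == "__main__":
    reference = "ab" * 50                    # "normal" data
    new = "ab" * 10 + "aa" + "ab" * 10       # contains an unusual run of a's
    scores = surprise_scores(reference, new, 2)
    # Rank by absolute deviation, since a pattern can be surprisingly
    # frequent or surprisingly rare.
    for pattern in sorted(scores, key=lambda p: abs(scores[p]), reverse=True):
        print(pattern, round(scores[pattern], 2))
```

Ranking by absolute deviation reflects the abstract's definition, under which a pattern is surprising when its frequency differs from expectation in either direction.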
Pages: 61-93
Number of pages: 33