Distribution Agnostic Symbolic Representations for Time Series Dimensionality Reduction and Online Anomaly Detection

Cited by: 5
Authors
Bountrogiannis, Konstantinos [1, 2]
Tzagkarakis, George [2]
Tsakalides, Panagiotis [2]
Affiliations
[1] Univ Crete, Comp Sci Dept, Iraklion 70013, Greece
[2] Fdn Res & Technol Hellas, Inst Comp Sci, GR-70013 Iraklion, Greece
Keywords
Time series analysis; Data mining; Anomaly detection; Aggregates; Task analysis; Quantization (signal); Market research; dynamic clustering; kernel methods; streaming data; symbolic representations; aggregate approximation; mean shift
DOI
10.1109/TKDE.2022.3174630
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Owing to the importance of lower-bounding distances and the attractiveness of symbolic representations, the family of symbolic aggregate approximations (SAX) has been used extensively for encoding time series data. However, typical SAX-based methods rely on two restrictive assumptions: Gaussian-distributed data and equiprobable symbols. This paper proposes two novel data-driven SAX-based symbolic representations, distinguished by their discretization steps. The first representation, oriented toward general data compaction and indexing scenarios, combines kernel density estimation with Lloyd-Max quantization to minimize the information loss and mean squared error in the discretization step. The second, oriented toward high-level mining tasks, employs the Mean-Shift clustering method and is shown to enhance anomaly detection in the lower-dimensional space. In addition, we verify on a theoretical basis a previously observed phenomenon of the intrinsic process that results in a lower-than-expected variance of the intermediate piecewise aggregate approximation. This phenomenon causes additional information loss but can be avoided with a simple modification. The proposed representations possess all the attractive properties of the conventional SAX method. Furthermore, experimental evaluation on real-world datasets demonstrates their superiority over the traditional SAX and an alternative data-driven SAX variant.
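As a rough illustration of the first representation described in the abstract, the sketch below chains a piecewise aggregate approximation (PAA) with a Lloyd-Max quantizer to turn a series into symbols. It is not the authors' implementation: the function names `paa`, `lloyd_max`, and `symbolize` are hypothetical, the quantizer is fitted directly on empirical samples rather than on a kernel density estimate, and fitting it on the PAA values instead of the raw series is an assumption of this sketch.

```python
import numpy as np

def paa(series, n_segments):
    """Piecewise aggregate approximation: mean of each equal-length segment."""
    series = np.asarray(series, dtype=float)
    usable = len(series) - len(series) % n_segments  # drop a ragged tail, if any
    return series[:usable].reshape(n_segments, -1).mean(axis=1)

def lloyd_max(samples, n_levels, n_iter=50):
    """Empirical Lloyd-Max quantizer (assumption: samples stand in for a KDE).

    Alternates between midpoint breakpoints and cell-mean reconstruction
    levels, which lowers the mean squared quantization error at each step.
    """
    samples = np.sort(np.asarray(samples, dtype=float))
    # Start from evenly spaced quantiles so every cell is initially populated.
    levels = np.quantile(samples, (np.arange(n_levels) + 0.5) / n_levels)
    for _ in range(n_iter):
        breakpoints = (levels[:-1] + levels[1:]) / 2.0
        cells = np.searchsorted(breakpoints, samples)
        for k in range(n_levels):
            members = samples[cells == k]
            if members.size:  # keep the old level if a cell empties out
                levels[k] = members.mean()
    breakpoints = (levels[:-1] + levels[1:]) / 2.0
    return breakpoints, levels

def symbolize(paa_values, breakpoints):
    """Map each PAA value to the integer index of its quantization cell."""
    return np.searchsorted(breakpoints, paa_values)

# Usage on synthetic, clearly non-Gaussian data (illustrative only).
rng = np.random.default_rng(0)
series = rng.exponential(size=1024)
z = (series - series.mean()) / series.std()      # z-normalize, as in SAX
reduced = paa(z, 64)                             # 1024 points -> 64 segments
breakpoints, levels = lloyd_max(reduced, 4)      # 4-symbol alphabet
symbols = symbolize(reduced, breakpoints)
```

Conventional SAX would instead take the breakpoints from equiprobable quantiles of a standard Gaussian; here they adapt to whatever distribution the data actually follow.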
Pages: 5752-5766
Number of pages: 15