Improving discretization based pattern discovery for multivariate time series by additional preprocessing

Cited by: 1
Authors
Noering, Fabian Kai-Dietrich [1]
Jonas, Konstantin [2]
Klawonn, Frank [3, 4]
Affiliations
[1] Volkswagen AG, Wolfsburg, Germany
[2] Deutsche Bahn AG, Volkswagen AG, Berlin, Germany
[3] Ostfalia Univ Appl Sci, Dept Comp Sci, Wolfenbuttel, Germany
[4] Helmholtz Ctr Infect Res, Braunschweig, Germany
Keywords
Time series data mining; pattern discovery; motif discovery; variable pattern length; unsupervised; multivariate; LINEAR-TIME; MOTIFS
DOI
10.3233/IDA-205329
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In technical systems, the analysis of similar load situations is a promising technique for gaining information about the system's state, health, or wear. Load situations are often difficult to define by hand. Hence, they need to be discovered as recurrent patterns within multivariate time series data of the system under consideration. Unsupervised algorithms for finding such recurrent patterns in multivariate time series must be able to cope with very large data sets, because the system might be observed over a very long time. In our previous work, we identified discretization-based approaches as particularly promising for variable-length pattern discovery because of their low computing time, which results from the simplification (symbolization) of the time series. In this paper, we propose additional preprocessing steps for the symbolic representation of time series, aiming at enhanced multivariate pattern discovery. Beyond that, we show the performance (quality and computing time) of our algorithms on a synthetic test data set as well as on a real-life example with 100 million time points. We also test our approach with increasing dimensionality of the time series.
Pages: 1051-1072
Number of pages: 22
Related papers
50 in total
  • [21] Temporal Pattern Mining for Multivariate Time Series Classification
    Dua, Sumeet
    Saini, Sheetal
    Singh, Harpreet
    JOURNAL OF MEDICAL IMAGING AND HEALTH INFORMATICS, 2011, 1 (02) : 164 - 169
  • [22] Temporal pattern attention for multivariate time series forecasting
    Shih, Shun-Yao
    Sun, Fan-Keng
    Lee, Hung-yi
    MACHINE LEARNING, 2019, 108 (8-9) : 1421 - 1441
  • [23] A Cluster-based Genetic Approach for Segmentation of Time Series and Pattern Discovery
    Tseng, Vincent S.
    Chen, Chun-Hao
    Huang, Pal-Chieh
    Hong, Tzung-Pei
    2008 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION, VOLS 1-8, 2008, : 1949 - +
  • [24] SHAPE-BASED TIME SERIES SIMILARITY MEASURE AND PATTERN DISCOVERY ALGORITHM
    Zeng Fanzi, Qiu Zhengding, Li Dongsheng, Yue Jianhai (Institute of Information and Science
    Journal of Electronics (China), 2005, (02) : 142 - 148
  • [25] SHAPE-BASED TIME SERIES SIMILARITY MEASURE AND PATTERN DISCOVERY ALGORITHM
    Zeng Fanzi, Qiu Zhengding, Li Dongsheng, Yue Jianhai (Institute of Information and Science, Beijing Jiaotong University, Beijing, China; Dongjian Hydropower Plant, Hunan, China)
    Journal of Electronics, 2005, (02) : 142 - 148
  • [26] Time series pattern discovery by segmental Gaussian models
    Shuichiro, I
    Makoto, S
    Akihiko, N
    PRICAI 2004: TRENDS IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2004, 3157 : 975 - 976
  • [27] A new hybrid method for predicting univariate and multivariate time series based on pattern forecasting
    Castan-Lascorz, M. A.
    Jimenez-Herrera, P.
    Troncoso, A.
    Asencio-Cortes, G.
    INFORMATION SCIENCES, 2022, 586 : 611 - 627
  • [28] Pattern discovery of fuzzy time series for financial prediction
    Lee, CHL
    Liu, A
    Chen, WS
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2006, 18 (05) : 613 - 625
  • [29] A new hybrid method for predicting univariate and multivariate time series based on pattern forecasting
    Castán-Lascorz, M.A.
    Jiménez-Herrera, P.
    Troncoso, A.
    Asencio-Cortés, G.
    Information Sciences, 2022, 586 : 611 - 627
  • [30] Privacy preserving pattern discovery in distributed time series
    da Silva, Josenildo Costa
    Klusch, Matthias
    2007 IEEE 23RD INTERNATIONAL CONFERENCE ON DATA ENGINEERING WORKSHOP, VOLS 1-2, 2007, : 207 - 214