A Variable Markovian based Outlier Detection Method for Multi-dimensional Sequence over Data Stream

Cited by: 0
Authors
Yang, Dongsheng [1 ]
Wang, Yijie [1 ]
Li, Yongmou [1 ]
Ma, Xingkong [1 ]
Affiliations
[1] Natl Univ Def Technol, Coll Comp, Sci & Technol Parallel & Distributed Proc Lab, Changsha 410073, Hunan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
multi-dimensional sequence; data stream; outlier detection; feature selection; mutual information; variable Markovian; QUERIES;
DOI
10.1109/PDCAT.2016.48
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Subject classification code
0812 ;
Abstract
Sequence data today increasingly takes the form of multi-dimensional sequences over data streams, which have a large state space and arrive at unprecedented speed. Designing a multi-dimensional sequence outlier detection method that is both accurate and fast is a major challenge. Traditional methods cannot handle multi-dimensional sequences effectively because they model such sequences poorly, and they cannot detect outliers in a timely manner because of their high computational complexity. In this paper we propose VMOD, a variable Markovian based outlier detection method for multi-dimensional sequences over data streams, which consists of two algorithms: a mutual information based feature selection algorithm (MIFS) and a variable Markovian based sequential analysis algorithm (VMSA). VMOD uses MIFS to reduce the state space and remove redundant features, and uses VMSA to accelerate outlier detection, thereby improving both the detection rate and the detection speed. MIFS uses mutual information as the similarity measure and adopts a clustering based strategy to select features; by reducing the state space and the number of redundant features, it improves sequence modeling and, consequently, the detection rate. VMSA uses random sampling and an index structure to accelerate construction of the variable Markovian model and to reduce model complexity, which in turn speeds up outlier detection. Experiments show that VMOD detects outliers effectively and reduces detection time by at least 50% compared with traditional methods.
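The abstract does not give the details of MIFS, so the following is only a minimal sketch of the general idea it names: using mutual information as a similarity measure between features and keeping one representative per group of highly similar features. The function names, the greedy grouping rule, the `threshold` parameter, and the toy data are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) of two equal-length discrete sequences."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of x
    py = Counter(ys)            # marginal counts of y
    pxy = Counter(zip(xs, ys))  # joint counts of (x, y)
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def select_features(columns, threshold):
    """Hypothetical greedy grouping (not the paper's MIFS): keep a feature only
    if its mutual information with every already-kept feature is below
    `threshold`; otherwise treat it as redundant."""
    representatives = []
    for name, values in columns.items():
        if all(mutual_information(values, columns[r]) < threshold
               for r in representatives):
            representatives.append(name)
    return representatives


# Toy data (assumed): "b" is a deterministic function of "a", "c" is unrelated.
columns = {
    "a": [0, 0, 1, 1, 0, 0, 1, 1],
    "b": [1, 1, 0, 0, 1, 1, 0, 0],
    "c": [0, 1, 0, 1, 0, 1, 0, 1],
}
selected = select_features(columns, threshold=0.5)
print(selected)
```

In this toy input, "a" and "b" share one full bit of information, so only one of them is kept, while "c" carries no information about "a" and is retained. A real implementation over a data stream would need to estimate these quantities incrementally, which this sketch does not attempt.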
Pages: 183-188
Number of pages: 6
Related papers
50 in total
  • [32] Statistical Change Detection for Multi-Dimensional Data
    Song, Xiuyao
    Wu, Mingxi
    Jermaine, Christopher
    Ranka, Sanjay
    KDD-2007 PROCEEDINGS OF THE THIRTEENTH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2007, : 667 - 676
  • [33] A LoOP based outlier detection method for high dimensional fuzzy data set
    Jahromi, Alireza Fakharzadeh
    Zarei, Fateme
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2017, 32 (01) : 241 - 248
  • [34] A multi-phase approach for classifying multi-dimensional sequence data
    Lee, Chang-Hwan
    INTELLIGENT DATA ANALYSIS, 2015, 19 (03) : 547 - 561
  • [35] Multi-dimensional range query over encrypted data
    Shi, Elaine
    Bethencourt, John
    Chan, T-H. Hubert
    Song, Dawn
    Perrig, Adrian
    2007 IEEE SYMPOSIUM ON SECURITY AND PRIVACY, PROCEEDINGS, 2007, : 350 - +
  • [36] A Method Based on Tensor Decomposition for Missing Multi-dimensional Data Completion
    Chen, Jianke
    Chen, Pinghua
    2017 IEEE 2ND INTERNATIONAL CONFERENCE ON BIG DATA ANALYSIS (ICBDA), 2017, : 149 - 153
  • [37] Concept Drift Based Multi-dimensional Data Streams Sampling Method
    Lin, Ling
    Qi, Xiaolong
    Zhu, Zhirui
    Gao, Yang
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2019, PT I, 2019, 11439 : 331 - 342
  • [38] A Novel Weighted Frequent Pattern-Based Outlier Detection Method Applied to Data Stream
    Yuan, Gang
    Cai, Saihua
    Hao, Shangbo
    2019 IEEE 4TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA ANALYSIS (ICCCBDA), 2019, : 503 - 510
  • [39] Discretization Method for the Range of Values of a Multi-Dimensional Random Variable
    A. V. Lapko
    V. A. Lapko
    Measurement Techniques, 2019, 62 : 16 - 22
  • [40] Discretization Method for the Range of Values of a Multi-Dimensional Random Variable
    Lapko, A. V.
    Lapko, V. A.
    MEASUREMENT TECHNIQUES, 2019, 62 (01) : 16 - 22