Continuous Monitoring of Distance-Based Outliers over Data Streams

Authors
Kontaki, Maria [1]
Gounaris, Anastasios [1]
Papadopoulos, Apostolos N. [1]
Tsichlas, Kostas [1]
Manolopoulos, Yannis [1]
Affiliations
[1] Aristotle Univ Thessaloniki, Dept Informat, Thessaloniki 54124, Greece
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Anomaly detection is considered an important data mining task, aiming at the discovery of elements (also known as outliers) that deviate significantly from the expected case. More specifically, given a set of objects, the problem is to return the suspicious objects that deviate significantly from the typical behavior. As in the case of clustering, the application of different criteria leads to different definitions of an outlier. In this work, we focus on distance-based outliers: an object x is an outlier if there are fewer than k objects lying at distance at most R from x. The problem poses significant challenges in a stream-based environment, where data arrive continuously and outliers must be detected on the fly. A few research works study the problem of continuous outlier detection. However, none of these proposals meets the requirements of modern stream-based applications, for the following reasons: (i) they demand a significant storage overhead, (ii) their efficiency is limited and (iii) they lack flexibility. In this work, we propose new algorithms for continuous outlier monitoring in data streams, based on sliding windows. Our techniques are able to reduce the required storage overhead, run faster than previously proposed techniques and offer significant flexibility. Experiments performed on real-life as well as synthetic data sets verify our theoretical study.
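The distance-based outlier definition in the abstract can be illustrated with a naive baseline. The following is a minimal sketch, not the algorithms proposed in the paper: it recomputes all neighbor counts from scratch on every arrival, which is exactly the kind of cost the paper aims to avoid. The names `is_outlier` and `monitor`, the count-based window of size `W`, the Euclidean metric, and the choice not to count an object as its own neighbor are all assumptions made for illustration.

```python
from collections import deque
from math import dist  # Euclidean distance, Python 3.8+


def is_outlier(i, window, R, k):
    """Return True if window[i] has fewer than k other objects within distance R.

    Assumption: an object is not counted as its own neighbor.
    """
    x = window[i]
    neighbors = 0
    for j, y in enumerate(window):
        if j != i and dist(x, y) <= R:
            neighbors += 1
            if neighbors >= k:
                return False  # enough neighbors found, stop early
    return True


def monitor(stream, W, R, k):
    """Yield the current outliers after each arrival (naive recomputation).

    Assumption: a count-based sliding window that keeps the W most recent objects.
    """
    window = deque(maxlen=W)  # the oldest object expires automatically
    for point in stream:
        window.append(point)
        snapshot = list(window)
        yield [p for i, p in enumerate(snapshot) if is_outlier(i, snapshot, R, k)]


if __name__ == "__main__":
    stream = [(0.0,), (0.1,), (0.2,), (5.0,)]
    for step, outliers in enumerate(monitor(stream, W=3, R=0.5, k=2), start=1):
        print(step, outliers)
```

With `W=3`, the arrival of the fourth object expires the first one, so objects that were inliers a step earlier can become outliers. This is why continuous monitoring has to handle expirations as well as arrivals.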
Pages: 135 - 146
Page count: 12
Related papers
50 in total
  • [41] Rapid Parallel Detection of Distance-based Outliers in Time Series using MapReduce
    Ciolofan, Sorin N.
    Pop, Florin
    Mocanu, Mariana
    Cristea, Valentin
    [J]. CONTROL ENGINEERING AND APPLIED INFORMATICS, 2016, 18 (03) : 63 - 71
  • [42] Detecting outliers and influential points: an indirect classical Mahalanobis distance-based method
    Liu, Xuqing
    Gao, Feng
    Wu, Yandong
    Zhao, Zhiguo
    [J]. JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2018, 88 (11) : 2013 - 2033
  • [43] Distance-based Outliers Method for Detecting Disease Outbreaks using Social Media
    Dai, Xiangfeng
    Bikdash, Marwan
    [J]. SOUTHEASTCON 2016, 2016,
  • [44] DOLPHIN: An Efficient Algorithm for Mining Distance-Based Outliers in Very Large Datasets
    Angiulli, Fabrizio
    Fassetti, Fabio
    [J]. ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2009, 3 (01)
  • [45] Continuous distance-based skyline queries in road networks
    Huang, Yuan-Ko
    Chang, Chia-Heng
    Lee, Chiang
    [J]. INFORMATION SYSTEMS, 2012, 37 (07) : 611 - 633
  • [46] FROD: Fast and Robust Distance-Based Outlier Detection with Active-Inliers-Patterns in Data Streams
    Li, Zongren
    Wang, Yijie
    Zhao, Guohong
    Cheng, Li
    Ma, Xingkong
    [J]. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2018, PT I, 2018, 11139 : 626 - 636
  • [47] An Effective Minimal Probing Approach With Micro-Cluster for Distance-Based Outlier Detection in Data Streams
    Bah, Mohamed Jaward
    Wang, Hongzhi
    Hammad, Mohamed
    Zeshan, Furkh
    Aljuaid, Hanan
    [J]. IEEE ACCESS, 2019, 7 : 154922 - 154934
  • [48] Weighted distance-based trees for ranking data
    Antonella Plaia
    Mariangela Sciandra
    [J]. Advances in Data Analysis and Classification, 2019, 13 : 427 - 444
  • [49] Distance-based tree models for ranking data
    Lee, Paul H.
    Yu, Philip L. H.
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2010, 54 (06) : 1672 - 1682
  • [50] Continuous Outlier Monitoring on Uncertain Data Streams
    曹科研
    王国仁
    韩东红
    丁国辉
    王爱侠
    石凌旭
    [J]. Journal of Computer Science & Technology, 2014, 29 (03) : 436 - 448