Continuous Monitoring of Distance-Based Outliers over Data Streams

Cited by: 0
Authors
Kontaki, Maria [1]
Gounaris, Anastasios [1]
Papadopoulos, Apostolos N. [1]
Tsichlas, Kostas [1]
Manolopoulos, Yannis [1]
Affiliation
[1] Aristotle Univ Thessaloniki, Dept Informat, Thessaloniki 54124, Greece
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Anomaly detection is considered an important data mining task, aiming at the discovery of elements (also known as outliers) that show significant deviation from the expected case. More specifically, given a set of objects, the problem is to return the suspicious objects that deviate significantly from typical behavior. As in the case of clustering, applying different criteria leads to different definitions of an outlier. In this work, we focus on distance-based outliers: an object x is an outlier if fewer than k objects lie at distance at most R from x. The problem poses significant challenges in a stream-based environment, where data arrive continuously and outliers must be detected on the fly. A few research works study the problem of continuous outlier detection. However, none of these proposals meets the requirements of modern stream-based applications, for the following reasons: (i) they incur significant storage overhead, (ii) their efficiency is limited, and (iii) they lack flexibility. In this work, we propose new algorithms for continuous outlier monitoring in data streams, based on sliding windows. Our techniques reduce the required storage overhead, run faster than previously proposed techniques, and offer significant flexibility. Experiments performed on real-life as well as synthetic data sets verify our theoretical study.
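The distance-based definition in the abstract can be illustrated with a naive baseline. The sketch below is not the authors' algorithm: it recomputes every neighbor count from scratch on each arrival, which is exactly the kind of cost the paper aims to avoid. The function names (`current_outliers`, `monitor`), the count-based sliding window, the Euclidean metric, and the choice not to count an object as its own neighbor are all assumptions made for illustration.

```python
from collections import deque
from math import dist  # Euclidean distance, Python 3.8+


def current_outliers(points, k, R):
    """Return the points that have fewer than k other points within distance R."""
    outliers = []
    for i, x in enumerate(points):
        # Count neighbors of x, skipping x itself by position (assumption).
        neighbors = sum(
            1 for j, y in enumerate(points) if j != i and dist(x, y) <= R
        )
        if neighbors < k:
            outliers.append(x)
    return outliers


def monitor(stream, window_size, k, R):
    """Yield (arrival_index, outliers) after each arrival in the stream.

    The window keeps only the most recent `window_size` objects; older
    objects expire automatically when new ones arrive.
    """
    window = deque(maxlen=window_size)
    for i, point in enumerate(stream):
        window.append(point)
        yield i, current_outliers(list(window), k, R)


if __name__ == "__main__":
    stream = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (0.1, 0.1)]
    for i, outliers in monitor(stream, window_size=4, k=2, R=0.5):
        print(i, outliers)
```

Each arrival costs a number of distance computations quadratic in the window size here, so this is useful only as a correctness reference for small windows.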
Pages: 135 - 146
Page count: 12
Related papers
50 in total
  • [1] Efficient and flexible algorithms for monitoring distance-based outliers over data streams
    Kontaki, Maria
    Gounaris, Anastasios
    Papadopoulos, Apostolos N.
    Tsichlas, Kostas
    Manolopoulos, Yannis
    [J]. INFORMATION SYSTEMS, 2016, 55 : 37 - 53
  • [2] Distance-based Outlier Detection in Data Streams
    Tran, Luan
    Fan, Liyue
    Shahabi, Cyrus
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2016, 9 (12) : 1089 - 1100
  • [3] Distance-based outliers in sequences
    Palshikar, GK
    [J]. DISTRIBUTED COMPUTING AND INTERNET TECHNOLOGY, PROCEEDINGS, 2005, 3816 : 547 - 552
  • [4] Reducing distance computations for distance-based outliers
    Angiulli, Fabrizio
    Basta, Stefano
    Lodi, Stefano
    Sartori, Claudio
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2020, 147
  • [5] Continuous Monitoring of Distance-Based Range Queries
    Cheema, Muhammad Aamir
    Brankovic, Ljiljana
    Lin, Xuemin
    Zhang, Wenjie
    Wang, Wei
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2011, 23 (08) : 1182 - 1199
  • [6] Distance-based outliers: algorithms and applications
    Knorr, EM
    Ng, RT
    Tucakov, V
    [J]. VLDB JOURNAL, 2000, 8 (3-4) : 237 - 253
  • [7] Distance-based detection and prediction of outliers
    Angiulli, F
    Basta, S
    Pizzuti, C
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2006, 18 (02) : 145 - 160
  • [8] Improving prediction of distance-based outliers
    Angiulli, F
    Basta, S
    Pizzuti, C
    [J]. DISCOVERY SCIENCE, PROCEEDINGS, 2004, 3245 : 89 - 100
  • [9] Explainable Distance-Based Outlier Detection in Data Streams
    Toliopoulos, Theodoros
    Gounaris, Anastasios
    [J]. IEEE ACCESS, 2022, 10 : 47921 - 47936
  • [10] A Probabilistic Transformation of Distance-Based Outliers
    Muhr, David
    Affenzeller, Michael
    Kueng, Josef
    [J]. MACHINE LEARNING AND KNOWLEDGE EXTRACTION, 2023, 5 (03) : 782 - 802