There and back again: Outlier detection between statistical reasoning and data mining algorithms

Cited by: 88
Authors
Zimek, Arthur [1]
Filzmoser, Peter [2]
Affiliations
[1] Univ Southern Denmark, Dept Math & Comp Sci, Campusvej 55, DK-5230 Odense M, Denmark
[2] Vienna Univ Technol, Inst Stat & Math Methods Econ, Vienna, Austria
Keywords
anomaly detection; outlier detection; outlier model; statistics and data mining; DISTANCE-BASED OUTLIERS; ANOMALY DETECTION; NOVELTY DETECTION; IDENTIFICATION; FRAMEWORK; EFFICIENT; LOCATION; REJECTION; SELECTION; EXPLORATION;
DOI
10.1002/widm.1280
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Outlier detection has been a topic in statistics for centuries. Over mainly the last two decades, there has been also an increasing interest in the database and data mining community to develop scalable methods for outlier detection. Initially based on statistical reasoning, however, these methods soon lost the direct probabilistic interpretability of the derived outlier scores. Here, we detail from a joint point of view of data mining and statistics the roots and the path of development of statistical outlier detection and of database-related data mining methods for outlier detection. We discuss their inherent meaning, review approaches to again find a statistically meaningful interpretation of outlier scores, and sketch related current research topics.
This article is categorized under: Algorithmic Development > Statistics; Algorithmic Development > Scalable Statistical Methods; Technologies > Machine Learning
Pages: 26