Unsupervised Artificial Neural Networks for Outlier Detection in High-Dimensional Data

Cited by: 2
Authors
Popovic, Daniel [1]
Fouche, Edouard [1]
Boehm, Klemens [1]
Affiliations
[1] Karlsruhe Inst Technol KIT, Karlsruhe, Germany
Keywords
Unsupervised learning; Outlier detection; Neural networks; SELECTION;
DOI
10.1007/978-3-030-28730-6_1
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Outlier detection is an important field in data mining. For high-dimensional data, the task is particularly challenging because of the so-called "curse of dimensionality": the notion of neighborhood becomes meaningless, and points typically show their outlying behavior only in subspaces. As a result, traditional approaches are ineffective. Because of the lack of a ground truth in real-world data and of a priori knowledge about the characteristics of potential outliers, outlier detection should be considered an unsupervised learning problem. In this paper, we examine the usefulness of unsupervised artificial neural networks (autoencoders, self-organising maps, and restricted Boltzmann machines) for detecting outliers in high-dimensional data in a fully unsupervised way. Each of those approaches aims to learn an approximate representation of the data. We show that one can measure the "outlierness" of objects effectively by measuring their deviation from the learned representation. Our experiments show that neural-based approaches outperform the current state of the art in terms of both runtime and accuracy.
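The idea of scoring "outlierness" as deviation from a learned representation can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear autoencoder fitted in closed form via SVD (equivalent to PCA) as a simple stand-in for the neural models named in the abstract, and the function names, synthetic data, and hidden size are all hypothetical.

```python
import numpy as np

def fit_linear_autoencoder(X, n_hidden):
    """Fit a linear autoencoder in closed form (assumption: PCA stand-in)."""
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_hidden]

def outlier_scores(X, mean, components):
    """Outlier score = squared reconstruction error after encode/decode."""
    centered = X - mean
    codes = centered @ components.T        # encode into the hidden space
    reconstructed = codes @ components     # decode back to the input space
    return np.sum((centered - reconstructed) ** 2, axis=1)

rng = np.random.default_rng(0)
# Hypothetical data: 200 inliers close to a 2-D subspace of a 10-D space.
latent = rng.normal(size=(200, 2))
basis = rng.normal(size=(2, 10))
inliers = latent @ basis + 0.05 * rng.normal(size=(200, 10))
# One point that does not follow the subspace structure.
outlier = 3.0 * rng.normal(size=(1, 10))
X = np.vstack([inliers, outlier])

# Fully unsupervised: the model is fitted on all points, outlier included.
mean, components = fit_linear_autoencoder(X, n_hidden=2)
scores = outlier_scores(X, mean, components)
top = int(np.argmax(scores))  # index of the point with the largest deviation
```

Points that the learned representation reconstructs poorly receive high scores; ranking by `scores` gives an outlier ranking without any labels. A nonlinear autoencoder, self-organising map, or restricted Boltzmann machine would replace the encode/decode step while keeping the same scoring principle.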
Pages: 3 - 19
Page count: 17
Related papers
50 in total
  • [1] Outlier detection for high-dimensional data
    Ro, Kwangil
    Zou, Changliang
    Wang, Zhaojun
    Yin, Guosheng
    [J]. BIOMETRIKA, 2015, 102 (03) : 589 - 599
  • [2] Fast outlier detection for high-dimensional data of wireless sensor networks
    Qiao, Yan
    Cui, Xinhong
    Jin, Peng
    Zhang, Wu
    [J]. INTERNATIONAL JOURNAL OF DISTRIBUTED SENSOR NETWORKS, 2020, 16 (10)
  • [3] Efficient Outlier Detection for High-Dimensional Data
    Liu, Huawen
    Li, Xuelong
    Li, Jiuyong
    Zhang, Shichao
    [J]. IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2018, 48 (12) : 2451 - 2461
  • [4] A geometric framework for outlier detection in high-dimensional data
    Herrmann, Moritz
    Pfisterer, Florian
    Scheipl, Fabian
    [J]. WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, 2023, 13 (03)
  • [5] A Comparison of Outlier Detection Techniques for High-Dimensional Data
    Xu, Xiaodan
    Liu, Huawen
    Li, Li
    Yao, Minghai
    [J]. INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2018, 11 (01) : 652 - 662
  • [6] A Comparison of Outlier Detection Techniques for High-Dimensional Data
    Xiaodan Xu
    Huawen Liu
    Li Li
    Minghai Yao
    [J]. International Journal of Computational Intelligence Systems, 2018, 11 : 652 - 662
  • [7] ROBUST CLASSIFICATION OF HIGH-DIMENSIONAL DATA USING ARTIFICIAL NEURAL NETWORKS
    SMITH, DJ
    BAILEY, TC
    MUNFORD, AG
    [J]. STATISTICS AND COMPUTING, 1993, 3 (02) : 71 - 81
  • [8] Outlier ensembles: A robust method for damage detection and unsupervised feature extraction from high-dimensional data
    Bull, L. A.
    Worden, K.
    Fuentes, R.
    Manson, G.
    Cross, E. J.
    Dervilis, N.
    [J]. JOURNAL OF SOUND AND VIBRATION, 2019, 453 : 126 - 150
  • [9] Research on Outlier Detection for High-Dimensional Data Based on PPCLOF
    Chen, Chen
    Luo, Kaiwen
    Min, Lan
    Li, Shenglin
    [J]. JOURNAL OF WEB ENGINEERING, 2021, 20 (03) : 743 - 758
  • [10] Thresholding-based outlier detection for high-dimensional data
    Yang, Xiaona
    Wang, Zhaojun
    Zi, Xuemin
    [J]. JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2018, 88 (11) : 2170 - 2184