A Comparative Evaluation of Unsupervised Anomaly Detection Algorithms for Multivariate Data

Cited by: 544
Authors
Goldstein, Markus [1]
Uchida, Seiichi [2]
Affiliations
[1] Kyushu Univ, Ctr Coevolut Social Syst Innovat, Fukuoka 812, Japan
[2] Kyushu Univ, Dept Adv Informat Technol, Fukuoka 812, Japan
Source
PLOS ONE | 2016 / Vol. 11 / Issue 04
Funding
Japan Science and Technology Agency;
Keywords
NOVELTY DETECTION;
DOI
10.1371/journal.pone.0152173
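Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract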
Anomaly detection is the process of identifying unexpected items or events in datasets, which differ from the norm. In contrast to standard classification tasks, anomaly detection is often applied to unlabeled data, taking only the internal structure of the dataset into account. This challenge is known as unsupervised anomaly detection and is addressed in many practical applications, for example in network intrusion detection, fraud detection, and the life science and medical domains. Dozens of algorithms have been proposed in this area, but unfortunately the research community still lacks a comparative universal evaluation as well as common publicly available datasets. These shortcomings are addressed in this study, where 19 different unsupervised anomaly detection algorithms are evaluated on 10 different datasets from multiple application domains. By publishing the source code and the datasets, this paper aims to be a new well-founded basis for unsupervised anomaly detection research. Additionally, this evaluation reveals the strengths and weaknesses of the different approaches for the first time. Besides the anomaly detection performance, the computational effort, the impact of parameter settings, and the global/local anomaly detection behavior are outlined. In conclusion, we give advice on algorithm selection for typical real-world tasks.
Pages: 31
Related papers
50 in total
  • [41] Unsupervised Graph Anomaly Detection Algorithms Implemented in Apache Spark
    Semenov, A.
    Mazeev, A.
    Doropheev, D.
    Yusubaliev, T.
    LOBACHEVSKII JOURNAL OF MATHEMATICS, 2018, 39 (09) : 1262 - 1269
  • [42] Unsupervised Multivariate Time Series Data Anomaly Detection in Industrial IoT: A Confidence Adversarial Autoencoder Network
    Shan, Jiahao
    Cai, Donghong
    Fang, Fang
    Khan, Zahid
    Fan, Pingzhi
    IEEE OPEN JOURNAL OF THE COMMUNICATIONS SOCIETY, 2024, 5 : 7752 - 7766
  • [43] UNSUPERVISED ANOMALY DETECTION FOR MULTIVARIATE TIME SERIES USING DIFFUSION MODEL
    Hu, Rongyao
    Yuan, Xinyu
    Qiao, Yan
    Zhang, BenChu
    Zhao, Pei
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2024), 2024, : 9606 - 9610
  • [44] HYPERSPECTRAL ANOMALY DETECTION WITH DATA SPHERING AND UNSUPERVISED TARGET DETECTION
    Chen, Shuhan
    Li, Xiaorun
    Zhao, Liaoying
    2022 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2022), 2022, : 1975 - 1978
  • [45] Evaluation of the Performance of Unsupervised Learning Algorithms for Intrusion Detection in Unbalanced Data Environments
    Fernando, Gutierrez-Portela
    Florina, Almenares Mendoza
    Liliana, Calderon-Benavides
    IEEE ACCESS, 2024, 12 : 190134 - 190157
  • [46] Unsupervised Anomaly Detection Based on Data Augmentation and Mixing
    Ishida, Naoya
    Nagatsu, Yuki
    Hashimoto, Hideki
    IECON 2020: THE 46TH ANNUAL CONFERENCE OF THE IEEE INDUSTRIAL ELECTRONICS SOCIETY, 2020, : 529 - 533
  • [47] Unsupervised detection of contextual anomaly in remotely sensed data
    Liu, Qi
    Klucik, Rudy
    Chen, Chao
    Grant, Glenn
    Gallaher, David
    Lv, Qin
    Shang, Li
    REMOTE SENSING OF ENVIRONMENT, 2017, 202 : 75 - 87
  • [48] Explainable unsupervised anomaly detection for healthcare insurance data
    De Meulemeester, Hannes
    De Smet, Frank
    van Dorst, Johan
    Derroitte, Elise
    De Moor, Bart
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2025, 25 (01)
  • [49] An outlier ensemble for unsupervised anomaly detection in honeypots data
    Boukela, Lynda
    Zhang, Gongxuan
    Bouzefrane, Samia
    Zhou, Junlong
    INTELLIGENT DATA ANALYSIS, 2020, 24 (04) : 743 - 758
  • [50] Unsupervised Anomaly Detection for Conveyor Temperature SCADA Data
    Wodecki, Jacek
    Stefaniak, Pawel
    Polak, Marta
    Zimroz, Radoslaw
    ADVANCES IN CONDITION MONITORING OF MACHINERY IN NON-STATIONARY OPERATIONS, CMMNO 2016, 2018, 9