Combining normalizing flows with decision trees for interpretable unsupervised outlier detection

Cited by: 0
Authors
Papastefanopoulos, Vasilis [1 ]
Linardatos, Pantelis [1 ]
Kotsiantis, Sotiris [1 ]
Affiliations
[1] Department of Mathematics, University of Patras, Patras, 26504, Greece
DOI
10.1016/j.engappai.2024.109770
Abstract
Outlier detection is critical for ensuring data integrity across various domains, from fraud detection in finance to anomaly identification in healthcare. Despite the importance of anomaly detection, most methods focus on performance, with interpretability remaining underexplored in unsupervised learning. Interpretability is essential in contexts where understanding why certain data points are classified as outliers is as important as the detection itself. This study introduces an interpretable approach to unsupervised outlier detection by combining normalizing flows and decision trees. Normalizing flows transform complex data distributions into simpler, tractable forms, allowing precise density estimation and the generation of pseudo-labels that differentiate inliers from outliers. These pseudo-labels are subsequently used to train a decision tree, offering both a structured decision-making process and interpretability in an unsupervised context, thereby addressing a key gap in the field. Our method was evaluated against 23 established outlier detection algorithms across 17 datasets using Precision, Recall, F1 Score, and Matthews Correlation Coefficient (MCC). The results showed that our approach ranked 4th in F1 Score, 6th in MCC, 3rd in Precision, and 19th in Recall. While it performed strongly on some datasets and less so on others, this variability is likely due to dataset-specific characteristics. Post-hoc statistical significance testing demonstrated that interpretability in unsupervised outlier detection can be achieved without significantly compromising performance, making it a valuable option for applications that require transparent and understandable anomaly detection. © 2024 Elsevier Ltd
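The pipeline the abstract describes — estimate density, threshold it into pseudo-labels, then fit an interpretable decision tree on those labels — can be sketched as follows. This is a minimal illustration, not the paper's implementation: it substitutes a Gaussian kernel density estimate for the normalizing flow's log-density, and the 5% contamination threshold, synthetic data, and tree depth are all assumptions.

```python
import numpy as np
from sklearn.neighbors import KernelDensity
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic data (assumed): a dense inlier cluster plus scattered outliers.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, size=(200, 2)),
               rng.uniform(-6, 6, size=(10, 2))])

# Step 1: density estimation. The paper uses a normalizing flow for this;
# a Gaussian KDE stands in here as a simple, tractable substitute.
log_density = KernelDensity(bandwidth=0.5).fit(X).score_samples(X)

# Step 2: pseudo-labels. Points in the lowest-density tail (bottom 5%,
# an assumed contamination rate) are flagged as outliers.
threshold = np.quantile(log_density, 0.05)
pseudo_labels = (log_density < threshold).astype(int)  # 1 = outlier

# Step 3: train a decision tree on the pseudo-labels, giving a
# rule-based, inspectable model of what the density estimator flagged.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X, pseudo_labels)

# The tree's rules explain *why* a point is classified as an outlier.
print(export_text(tree, feature_names=["x1", "x2"]))
```

The interpretability claim rests on step 3: the flow (or any density model) scores points, but the distilled tree exposes axis-aligned thresholds a practitioner can read directly.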