Combining normalizing flows with decision trees for interpretable unsupervised outlier detection

Cited by: 0
Authors
Papastefanopoulos, Vasilis [1]
Linardatos, Pantelis [1]
Kotsiantis, Sotiris [1]
Affiliations
[1] Department of Mathematics, University of Patras, Patras, 26504, Greece
DOI
10.1016/j.engappai.2024.109770
Abstract
Outlier detection is critical for ensuring data integrity across various domains, from fraud detection in finance to anomaly identification in healthcare. Despite the importance of anomaly detection, most methods focus on performance, with interpretability remaining underexplored in unsupervised learning. Interpretability is essential in contexts where understanding why certain data points are classified as outliers is as important as the detection itself. This study introduces an interpretable approach to unsupervised outlier detection by combining normalizing flows and decision trees. Normalizing flows transform complex data distributions into simpler, tractable forms, allowing precise density estimation and the generation of pseudo-labels that differentiate inliers from outliers. These pseudo-labels are subsequently used to train a decision tree, offering both a structured decision-making process and interpretability in an unsupervised context, thereby addressing a key gap in the field. Our method was evaluated against 23 established outlier detection algorithms across 17 datasets using Precision, Recall, F1 Score, and Matthews Correlation Coefficient (MCC). The results showed that our approach ranked 4th in F1 Score, 6th in MCC, 3rd in Precision, and 19th in Recall. While it performed strongly on some datasets and less so on others, this variability is likely due to dataset-specific characteristics. Post-hoc statistical significance testing demonstrated that interpretability in unsupervised outlier detection can be achieved without significantly compromising performance, making it a valuable option for applications that require transparent and understandable anomaly detection. © 2024 Elsevier Ltd
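The pipeline described in the abstract (density estimation with a flow, pseudo-labels from the densities, then a decision tree trained on those labels) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the flow architecture, the thresholding rule, or the tree settings, so a diagonal affine flow fitted in closed form stands in for a trained neural flow, and the `contamination` fraction, the depth limit, the toy data, and all function names are assumptions made for the example.

```python
import math
import random
import statistics

def fit_affine_flow(X):
    """Stand-in flow (assumption): z = (x - mu) / sigma per feature, fitted by maximum likelihood."""
    cols = list(zip(*X))
    mu = [statistics.fmean(c) for c in cols]
    sigma = [statistics.pstdev(c) or 1.0 for c in cols]
    return mu, sigma

def log_density(x, mu, sigma):
    """Change of variables: log p(x) = log N(z; 0, I) + log|det dz/dx|."""
    z = [(xi - m) / s for xi, m, s in zip(x, mu, sigma)]
    log_base = sum(-0.5 * zi * zi - 0.5 * math.log(2 * math.pi) for zi in z)
    log_det = -sum(math.log(s) for s in sigma)
    return log_base + log_det

def pseudo_labels(X, mu, sigma, contamination):
    """Mark the lowest-density fraction of points as outliers (1); the rest are inliers (0)."""
    scores = [log_density(x, mu, sigma) for x in X]
    k = max(1, round(contamination * len(X)))
    threshold = sorted(scores)[k - 1]
    return [1 if s <= threshold else 0 for s in scores]

def gini(labels):
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def build_tree(X, y, depth=0, max_depth=3):
    """Greedy CART-style tree on the pseudo-labels; a leaf is the majority label."""
    majority = int(2 * sum(y) >= len(y))
    if depth == max_depth or len(set(y)) == 1:
        return majority
    best = None
    for j in range(len(X[0])):
        # Dropping the largest value keeps both sides of every split non-empty.
        for t in sorted({x[j] for x in X})[:-1]:
            left = [yi for x, yi in zip(X, y) if x[j] <= t]
            right = [yi for x, yi in zip(X, y) if x[j] > t]
            cost = len(left) * gini(left) + len(right) * gini(right)
            if best is None or cost < best[0]:
                best = (cost, j, t)
    if best is None:
        return majority
    _, j, t = best
    lo = [(x, yi) for x, yi in zip(X, y) if x[j] <= t]
    hi = [(x, yi) for x, yi in zip(X, y) if x[j] > t]
    return (j, t,
            build_tree([x for x, _ in lo], [yi for _, yi in lo], depth + 1, max_depth),
            build_tree([x for x, _ in hi], [yi for _, yi in hi], depth + 1, max_depth))

def predict(tree, x):
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if x[j] <= t else right
    return tree

def rules(tree, path=()):
    """Flatten the tree into readable if-then rules."""
    if not isinstance(tree, tuple):
        return [(" and ".join(path) or "always", "outlier" if tree else "inlier")]
    j, t, left, right = tree
    return (rules(left, path + (f"x[{j}] <= {t:.2f}",))
            + rules(right, path + (f"x[{j}] > {t:.2f}",)))

# Toy data (assumption): 200 Gaussian inliers plus three planted outliers.
random.seed(0)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(200)]
X += [[8.0, 8.0], [-9.0, 7.5], [10.0, -8.0]]

mu, sigma = fit_affine_flow(X)
y = pseudo_labels(X, mu, sigma, contamination=0.015)
tree = build_tree(X, y, max_depth=3)
for condition, label in rules(tree):
    print(f"if {condition}: {label}")
```

The printed rules are what makes the detector interpretable: each flagged point can be traced to a short conjunction of feature thresholds. In a fuller version the affine map would presumably be replaced by a learned invertible network, with the pseudo-labeling and tree stages left unchanged.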
Related papers
50 in total
  • [1] Interpretable fuzzy clustering using unsupervised fuzzy decision trees
    Jiao, Lianmeng
    Yang, Haoyu
    Liu, Zhun-ga
    Pan, Quan
    INFORMATION SCIENCES, 2022, 611 : 540 - 563
  • [2] Unsupervised anomaly detection in images using attentional normalizing flows
    Wu, Xingzhen
    Mao, Guojun
    Xing, Shuli
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 127
  • [3] Unsupervised Universal Steganalysis Combining Image Retrieval and Outlier Detection
    Xu, Chen
    Zhang, Tao
    Hou, Xiaodan
    2016 IEEE INFORMATION TECHNOLOGY, NETWORKING, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (ITNEC), 2016 : 1047 - 1050
  • [4] Interpretable Single-dimension Outlier Detection (ISOD): An Unsupervised Outlier Detection Method Based on Quantiles and Skewness Coefficients
    Huang, Yuehua
    Liu, Wenfen
    Li, Song
    Guo, Ying
    Chen, Wen
    APPLIED SCIENCES-BASEL, 2024, 14 (01):
  • [5] Robust Variational Autoencoders and Normalizing Flows for Unsupervised Network Anomaly Detection
    Najari, Naji
    Berlemont, Samuel
    Lefebvre, Gregoire
    Duffner, Stefan
    Garcia, Christophe
    ADVANCED INFORMATION NETWORKING AND APPLICATIONS, AINA-2022, VOL 2, 2022, 450 : 281 - 292
  • [6] Sequential Outlier Detection Based on Incremental Decision Trees
    Gokcesu, Kaan
    Neyshabouri, Mohammadreza Mohaghegh
    Gokcesu, Hakan
    Kozat, Suleyman Serdar
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2019, 67 (04) : 993 - 1005
  • [7] Unsupervised video anomaly detection via normalizing flows with implicit latent features
    Cho, MyeongAh
    Kim, Taeoh
    Kim, Woo Jin
    Cho, Suhwan
    Lee, Sangyoun
    PATTERN RECOGNITION, 2022, 129
  • [8] The NFLikelihood: An unsupervised DNNLikelihood from normalizing flows
    Reyes-Gonzalez, Humberto
    Torre, Riccardo
    SCIPOST PHYSICS CORE, 2024, 7 (03):
  • [9] Harmonizing Flows: Unsupervised MR Harmonization Based on Normalizing Flows
    Beizaee, Farzad
    Desrosiers, Christian
    Lodygensky, Gregory A.
    Dolz, Jose
    INFORMATION PROCESSING IN MEDICAL IMAGING, IPMI 2023, 2023, 13939 : 347 - 359
  • [10] An Unsupervised Approach for Combining Scores of Outlier Detection Techniques, Based on Similarity Measures
    Pasillas-Diaz, Jose Ramon
    Ratte, Sylvie
    ELECTRONIC NOTES IN THEORETICAL COMPUTER SCIENCE, 2016, 329 : 61 - 77