Combining normalizing flows with decision trees for interpretable unsupervised outlier detection

Cited by: 0
Authors
Papastefanopoulos, Vasilis [1]
Linardatos, Pantelis [1]
Kotsiantis, Sotiris [1]
Affiliations
[1] Department of Mathematics, University of Patras, Patras, 26504, Greece
DOI
10.1016/j.engappai.2024.109770
Abstract
Outlier detection is critical for ensuring data integrity across various domains, from fraud detection in finance to anomaly identification in healthcare. Despite the importance of anomaly detection, most methods focus on performance, with interpretability remaining underexplored in unsupervised learning. Interpretability is essential in contexts where understanding why certain data points are classified as outliers is as important as the detection itself. This study introduces an interpretable approach to unsupervised outlier detection by combining normalizing flows and decision trees. Normalizing flows transform complex data distributions into simpler, tractable forms, allowing precise density estimation and the generation of pseudo-labels that differentiate inliers from outliers. These pseudo-labels are subsequently used to train a decision tree, offering both a structured decision-making process and interpretability in an unsupervised context, thereby addressing a key gap in the field. Our method was evaluated against 23 established outlier detection algorithms across 17 datasets using Precision, Recall, F1 Score, and Matthews Correlation Coefficient (MCC). The results showed that our approach ranked 4th in F1 Score, 6th in MCC, 3rd in Precision, and 19th in Recall. While it performed strongly on some datasets and less so on others, this variability is likely due to dataset-specific characteristics. Post-hoc statistical significance testing demonstrated that interpretability in unsupervised outlier detection can be achieved without significantly compromising performance, making it a valuable option for applications that require transparent and understandable anomaly detection. © 2024 Elsevier Ltd
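The pipeline outlined in the abstract (density estimation with a flow, pseudo-labels from the densities, then a decision tree trained on those labels) can be illustrated with a minimal sketch. It is not the authors' implementation. A single element-wise affine map, the simplest possible normalizing flow, stands in for a learned flow, and the tree is a small hand-written CART-style classifier. All function names, the `contamination` quantile, and the toy data are assumptions made for illustration.

```python
import math
import random
from statistics import mean, pstdev


def fit_affine_flow(X):
    """Fit z = (x - mu) / sigma per feature (closed-form maximum likelihood)."""
    cols = list(zip(*X))
    mu = [mean(c) for c in cols]
    sigma = [pstdev(c) or 1e-9 for c in cols]  # guard against zero spread
    return mu, sigma


def log_density(x, mu, sigma):
    """Change of variables: log p(x) = log N(z; 0, 1) - sum(log sigma)."""
    total = 0.0
    for xi, m, s in zip(x, mu, sigma):
        z = (xi - m) / s
        total += -0.5 * z * z - 0.5 * math.log(2 * math.pi) - math.log(s)
    return total


def pseudo_labels(X, mu, sigma, contamination=0.05):
    """Mark the lowest-density fraction of points as outliers (label 1)."""
    scores = [log_density(x, mu, sigma) for x in X]
    k = max(1, int(len(X) * contamination))
    threshold = sorted(scores)[k - 1]
    return [1 if s <= threshold else 0 for s in scores]


def gini(labels):
    """Gini impurity of a list of binary labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def build_tree(X, y, depth=0, max_depth=3):
    """Greedy axis-aligned tree; leaves store the majority pseudo-label."""
    if depth == max_depth or len(set(y)) == 1:
        return {"label": int(sum(y) * 2 > len(y))}
    best = None
    for j in range(len(X[0])):
        # Skipping the largest value keeps both sides of a split non-empty.
        for t in sorted({x[j] for x in X})[:-1]:
            left = [yi for x, yi in zip(X, y) if x[j] <= t]
            right = [yi for x, yi in zip(X, y) if x[j] > t]
            cost = len(left) * gini(left) + len(right) * gini(right)
            if best is None or cost < best[0]:
                best = (cost, j, t)
    if best is None:  # every feature is constant, so no split is possible
        return {"label": int(sum(y) * 2 > len(y))}
    _, j, t = best
    lo = [(x, yi) for x, yi in zip(X, y) if x[j] <= t]
    hi = [(x, yi) for x, yi in zip(X, y) if x[j] > t]
    return {
        "feature": j,
        "threshold": t,
        "left": build_tree([x for x, _ in lo], [v for _, v in lo], depth + 1, max_depth),
        "right": build_tree([x for x, _ in hi], [v for _, v in hi], depth + 1, max_depth),
    }


def explain(tree, x):
    """Return the rule path followed for x and the predicted label."""
    path = []
    while "label" not in tree:
        j, t = tree["feature"], tree["threshold"]
        if x[j] <= t:
            path.append(f"x[{j}] <= {t:.2f}")
            tree = tree["left"]
        else:
            path.append(f"x[{j}] > {t:.2f}")
            tree = tree["right"]
    return path, tree["label"]


# Toy data (an assumption): 200 inliers near the origin, 10 planted outliers.
random.seed(0)
inliers = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(200)]
outliers = [[random.gauss(8, 0.3), random.gauss(8, 0.3)] for _ in range(10)]
X = inliers + outliers

mu, sigma = fit_affine_flow(X)
y = pseudo_labels(X, mu, sigma, contamination=0.05)
tree = build_tree(X, y, max_depth=3)
print(explain(tree, [8.0, 8.0]))
```

The rule path returned by `explain` is what makes the detector interpretable: each flagged point comes with the feature thresholds that led to its label. In practice the affine map would be replaced by a trained flow from a deep learning library, and the tree by a library implementation such as scikit-learn's `DecisionTreeClassifier`.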
Related papers
50 in total
  • [21] Unsupervised outlier detection in multidimensional data
    Atiq ur Rehman
    Samir Brahim Belhaouari
    Journal of Big Data, 8
  • [22] A new unsupervised outlier detection method
    Zheng, Lina
    Chen, Lijun
    Wang, Yini
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2024, 46 (01) : 1713 - 1734
  • [23] On the Internal Evaluation of Unsupervised Outlier Detection
    Marques, Henrique O.
    Campello, Ricardo J. G. B.
    Zimek, Arthur
    Sander, Jorg
    PROCEEDINGS OF THE 27TH INTERNATIONAL CONFERENCE ON SCIENTIFIC AND STATISTICAL DATABASE MANAGEMENT, 2015
  • [24] Internal Evaluation of Unsupervised Outlier Detection
    Marques, Henrique O.
    Campello, Ricardo J. G. B.
    Sander, Jorg
    Zimek, Arthur
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2020, 14 (04)
  • [25] RDPOD: an unsupervised approach for outlier detection
    Abhaya, Abhaya
    Patra, Bidyut Kr
    NEURAL COMPUTING & APPLICATIONS, 2022, 34 (02) : 1065 - 1077
  • [26] Unsupervised outlier detection in multidimensional data
    Ur Rehman, Atiq
    Belhaouari, Samir Brahim
    JOURNAL OF BIG DATA, 2021, 8 (01)
  • [27] Bagged Subspaces for Unsupervised Outlier Detection
    Pasillas-Diaz, Jose Ramon
    Ratte, Sylvie
    COMPUTATIONAL INTELLIGENCE, 2017, 33 (03) : 507 - 523
  • [28] Intrusion detection combining multiple decision trees by fuzzy logic
    Tian, JF
    Fu, Y
    Xu, Y
    Wang, JL
    PDCAT 2005: SIXTH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED COMPUTING, APPLICATIONS AND TECHNOLOGIES, PROCEEDINGS, 2005, : 256 - 258
  • [29] Quantum normalizing flows for anomaly detection
    Rosenhahn, Bodo
    Hirche, Christoph
    Physical Review A, 2024, 110 (02)
  • [30] Optimal Decision Trees For Interpretable Clustering with Constraints
    Shati, Pouya
    Cohen, Eldan
    McIlraith, Sheila
    PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 2022 - 2030