Tree-based Kendall's τ Maximization for Explainable Unsupervised Anomaly Detection

Authors
Kong, Lanfang [1]
Huet, Alexis [2]
Rossi, Dario [2]
Sozio, Mauro [1]
Affiliations
[1] Inst Polytech Paris, Telecom Paris, Paris, France
[2] Huawei Technol Co Ltd, Paris, France
DOI
10.1109/ICDM58522.2023.00126
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104; 0812; 0835; 1405;
Abstract
We study the problem of building a relatively small regression tree that maximizes the Kendall's tau coefficient between the anomaly scores of a source anomaly detection algorithm and those predicted by our regression tree. We consider a labeling function which assigns to each leaf the inverse of its size, thereby providing satisfactory explanations when comparing examples with different anomaly scores. We show that our approach can be used as a post-hoc model, i.e. to provide global explanations for an existing anomaly detection algorithm. Moreover, it can be used as an in-model approach, i.e. the source anomaly detection algorithm can be replaced altogether. This is made possible by the off-the-shelf transparency of tree-based approaches and by the fact that the explanations provided by our approach do not rely on the source anomaly detection algorithm. The main technical challenge is the efficient computation of the Kendall's tau coefficient when determining the best split at each node of the regression tree. We show how this coefficient can be computed incrementally, thereby making the running time of our algorithm almost linear (up to a logarithmic factor) in the size of the input. Our approach is completely unsupervised, which is appealing when it is difficult to collect a large number of labeled examples. We complement our study with an extensive experimental evaluation against the state of the art, showing the effectiveness of our approach.
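To make the split criterion described in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a single one-dimensional feature, a single split (a stump), and the tau-a variant of Kendall's coefficient, since the abstract does not say which variant is used. It recomputes the coefficient from scratch for every candidate threshold in O(n^2) time, rather than incrementally as the paper proposes. The function names `kendall_tau_a`, `leaf_labels`, and `best_split`, as well as the toy data, are hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Naive O(n^2) Kendall's tau-a: (concordant - discordant) / number of pairs.

    Assumption: tau-a is used here only for illustration; tied pairs count as 0.
    """
    n = len(x)
    s = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        if prod > 0:
            s += 1  # concordant pair
        elif prod < 0:
            s -= 1  # discordant pair
    return s / (n * (n - 1) / 2)


def leaf_labels(feature, threshold):
    """Split on one threshold and label each point with 1 / (size of its leaf)."""
    left = [v <= threshold for v in feature]
    n_left = sum(left)
    n_right = len(feature) - n_left
    return [1.0 / n_left if is_left else 1.0 / n_right for is_left in left]


def best_split(feature, scores):
    """Try every midpoint between distinct sorted values; keep the tau-maximizing one.

    This recomputes tau for each candidate (naive), unlike the incremental
    computation mentioned in the abstract.
    """
    values = sorted(set(feature))
    best_threshold, best_tau = None, float("-inf")
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        tau = kendall_tau_a(scores, leaf_labels(feature, t))
        if tau > best_tau:
            best_threshold, best_tau = t, tau
    return best_threshold, best_tau


# Toy data: five ordinary points and one outlier with a high source score.
feature = [0.1, 0.2, 0.3, 0.4, 0.5, 9.0]
scores = [0.10, 0.12, 0.11, 0.13, 0.10, 0.95]
threshold, tau = best_split(feature, scores)
print(threshold, tau)
```

On this toy input the chosen split isolates the outlier in a leaf of size one, so its label (1/1) is larger than that of the five points sharing the other leaf (1/5), which agrees with the ordering of the source scores.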
Pages: 1073-1078
Page count: 6