STREAMRHF: Tree-Based Unsupervised Anomaly Detection for Data Streams

Cited by: 0
Authors
Nesic, Stefan [1]
Putina, Andrian [1]
Bahri, Maroua [2]
Huet, Alexis [3]
Navarro, Jose Manuel [3]
Rossi, Dario [3]
Sozio, Mauro [1]
Affiliations
[1] Telecom Paris, Paris, France
[2] Inria Paris, Paris, France
[3] Huawei Technol Co Ltd, Paris, France
Keywords
Data streams; Unsupervised learning; Anomaly detection; Random histogram
DOI
10.1109/AICCSA56895.2022.10017876
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
We present STREAMRHF, an unsupervised anomaly detection algorithm for data streams. Our algorithm builds on ideas from Random Histogram Forest (RHF) [1], a state-of-the-art algorithm for batch unsupervised anomaly detection. STREAMRHF constructs a forest of decision trees in which feature splits are determined by the kurtosis score of each feature. It irrevocably assigns an anomaly score to each data point as soon as it arrives, by incrementally updating its random trees and the kurtosis scores of the features. This enables efficient online scoring and concept drift detection at the same time. Because our approach is tree-based, it has several appealing properties, such as explainability of the results [2]. We conduct an extensive experimental evaluation on multiple datasets from different real-world applications. Our evaluation shows that our streaming algorithm achieves average precision comparable to RHF and outperforms state-of-the-art streaming approaches for unsupervised anomaly detection, while keeping computational complexity limited.
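The incremental kurtosis computation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class `StreamingKurtosis` and the helper `pick_split_feature` are hypothetical names, the moment updates follow the standard one-pass formulas for central moments, and the sampling rule (probability proportional to log(1 + kurtosis)) is an assumption about how a kurtosis score might guide feature selection, since the abstract does not state the exact rule.

```python
import math
import random


class StreamingKurtosis:
    """One-pass tracker of the kurtosis of a single feature (hypothetical helper)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the mean
        self.m3 = 0.0  # sum of cubed deviations
        self.m4 = 0.0  # sum of fourth-power deviations

    def update(self, x):
        """Fold one new value into the running central moments."""
        n1 = self.n
        self.n += 1
        n = self.n
        delta = x - self.mean
        delta_n = delta / n
        delta_n2 = delta_n * delta_n
        term1 = delta * delta_n * n1
        self.mean += delta_n
        # Order matters: m4 uses the old m2 and m3, m3 uses the old m2.
        self.m4 += (term1 * delta_n2 * (n * n - 3 * n + 3)
                    + 6 * delta_n2 * self.m2 - 4 * delta_n * self.m3)
        self.m3 += term1 * delta_n * (n - 2) - 3 * delta_n * self.m2
        self.m2 += term1

    def kurtosis(self):
        """Return the (non-excess) kurtosis n * M4 / M2^2.

        Returning 0.0 for an empty or constant feature is an assumption
        made here so that such features get zero sampling weight.
        """
        if self.n == 0 or self.m2 == 0.0:
            return 0.0
        return self.n * self.m4 / (self.m2 * self.m2)


def pick_split_feature(trackers, rng):
    """Sample a feature index with probability proportional to log(1 + kurtosis).

    The weighting rule is an illustrative assumption, not taken from the abstract.
    """
    weights = [math.log(1.0 + t.kurtosis()) for t in trackers]
    if sum(weights) == 0.0:
        # All features are constant so far: fall back to a uniform choice.
        return rng.randrange(len(trackers))
    return rng.choices(range(len(trackers)), weights=weights)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    trackers = [StreamingKurtosis(), StreamingKurtosis()]
    # Feature 0 is constant; feature 1 contains one extreme value.
    for row in [(5.0, 1.0), (5.0, 2.0), (5.0, 3.0), (5.0, 4.0), (5.0, 10.0)]:
        for tracker, value in zip(trackers, row):
            tracker.update(value)
    print("kurtosis per feature:", [t.kurtosis() for t in trackers])
    print("sampled split feature:", pick_split_feature(trackers, rng))
```

Because each update touches only a fixed number of running sums, the per-point cost is constant per feature, which is the property that makes this kind of statistic usable on a stream.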
Pages: 8
Related papers
50 in total
  • [1] Tree-based Kendall's τ Maximization for Explainable Unsupervised Anomaly Detection
    Kong, Lanfang
    Huet, Alexis
    Rossi, Dario
    Sozio, Mauro
    23RD IEEE INTERNATIONAL CONFERENCE ON DATA MINING, ICDM 2023, 2023, : 1073 - 1078
  • [2] Enhanced Tree-Based Anomaly Detection
    Karczmarek, Pawel
    Galka, Lukasz
    Dolecki, Michal
    Pedrycz, Witold
    Czerwinski, Dariusz
    Kiersztyn, Adam
    Stegierski, Rafal
    2022 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2022
  • [3] Tree-based algorithms for weakly supervised anomaly detection
    Finke, Thorben
    Hein, Marie
    Kasieczka, Gregor
    Kraemer, Michael
    Mueck, Alexander
    Prangchaikul, Parada
    Quadfasel, Tobias
    Shih, David
    Sommerhalder, Manuel
    PHYSICAL REVIEW D, 2024, 109 (03)
  • [4] Unsupervised Anomaly Detection for Network Data Streams in Industrial Control Systems
    Liu, Limengwei
    Hu, Modi
    Kang, Chaoqun
    Li, Xiaoyong
    INFORMATION, 2020, 11 (02)
  • [5] Anomaly-based error and intrusion detection in tabular data: No DNN outperforms tree-based classifiers
    Zoppi, Tommaso
    Gazzini, Stefano
    Ceccarelli, Andrea
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2024, 160 : 951 - 965
  • [6] Decision tree-based Feature Ranking in Concept Drifting Data Streams
    Pereira Karax, Jean Antonio
    Malucelli, Andreia
    Barddal, Jean Paul
    SAC '19: PROCEEDINGS OF THE 34TH ACM/SIGAPP SYMPOSIUM ON APPLIED COMPUTING, 2019, : 590 - 592
  • [7] Feature Scoring using Tree-Based Ensembles for Evolving Data Streams
    Gomes, Heitor Murilo
    de Mello, Rodrigo Fernandes
    Pfahringer, Bernhard
    Bifet, Albert
    2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2019, : 761 - 769
  • [8] Unsupervised Anomaly Detection Based on Data Augmentation and Mixing
    Ishida, Naoya
    Nagatsu, Yuki
    Hashimoto, Hideki
    IECON 2020: THE 46TH ANNUAL CONFERENCE OF THE IEEE INDUSTRIAL ELECTRONICS SOCIETY, 2020, : 529 - 533
  • [9] Unsupervised Multi Scale Anomaly Detection in Streams of Events
    Plessis, Quentin
    Suzuki, Masaki
    Kitahara, Takeshi
    2016 10TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATION SYSTEMS (ICSPCS), 2016
  • [10] Comparison of Tree-Based Methods for Multi-target Regression on Data Streams
    Osojnik, Aljaz
    Panov, Pance
    Dzeroski, Saso
    NEW FRONTIERS IN MINING COMPLEX PATTERNS, 2016, 9607 : 17 - 31