Leveraging an Isolation Forest to Anomaly Detection and Data Clustering

Cited by: 1
Authors
Yepmo, Veronne [1]
Smits, Gregory [2]
Lesot, Marie-Jeanne [3]
Pivert, Olivier [1]
Affiliations
[1] Univ Rennes, IRISA, Lannion, France
[2] Lab STICC, IMT Atlantique, Brest, France
[3] Sorbonne Univ, LIP6, Paris, France
Keywords
Anomaly/outlier detection; Isolation forest; Clustering; FUZZY; ALGORITHM; NOISE;
DOI
10.1016/j.datak.2024.102302
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Understanding why some points in a data set are considered anomalies cannot be done without taking into account the structure of the regular points. Whereas many machine learning methods are dedicated either to the identification of anomalies or to the identification of the inner structure of the data, a solution is introduced that addresses both tasks using a single data model, a variant of an isolation forest. The initial algorithm for constructing an isolation forest is revisited to preserve the inner structure of the data without affecting the efficiency of the outlier detection. Experiments conducted on both synthetic and real-world data sets show that, in addition to improving the detection of abnormal data points, the proposed variant of isolation forest allows for a reconstruction of the subspaces of high density. The proposed variant can therefore serve as a basis for a unified approach to detecting global and local anomalies, which is a necessary condition for then providing users with informative descriptions of the data.
Pages: 16
Related papers
50 in total
  • [31] Integrated Clustering and Anomaly Detection (INCAD) for Streaming Data
    Guggilam, Sreelekha
    Zaidi, Syed Mohammed Arshad
    Chandola, Varun
    Patra, Abani K.
    COMPUTATIONAL SCIENCE - ICCS 2019, PT IV, 2019, 11539 : 45 - 59
  • [32] Anomaly detection model based on data stream clustering
    Yin, Chunyong
    Zhang, Sun
    Yin, Zhichao
    Wang, Jin
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2019, 22 (Suppl 1) : 1729 - 1738
  • [33] Anomaly detection model based on data stream clustering
    Chunyong Yin
    Sun Zhang
    Zhichao Yin
    Jin Wang
    Cluster Computing, 2019, 22 : 1729 - 1738
  • [34] Anomaly intrusion detection based on clustering a data stream
    Oh, Sang-Hyun
    Kang, Jin-Suk
    Byun, Yung-Cheol
    Jeong, Taikyeong T.
    Lee, Won-Suk
    INFORMATION SECURITY, PROCEEDINGS, 2006, 4176 : 415 - 426
  • [35] Anomaly Detection using Data Clustering and Neural Networks
    Qiu, Hai
    Eklund, Neil
    Hu, Xiao
    Yan, Weizhong
    Iyer, Naresh
    2008 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-8, 2008 : 3627 - 3633
  • [36] An incremental clustering method for anomaly detection in flight data
    Zhao, Weizun
    Li, Lishuai
    Alam, Sameer
    Wang, Yanjun
    TRANSPORTATION RESEARCH PART C-EMERGING TECHNOLOGIES, 2021, 132
  • [37] An anomaly detection approach based on the combination of LSTM autoencoder and isolation forest for multivariate time series data
    Phuong Hanh Tran
    Heuchenne, Cedric
    Thomassey, Sebastien
    DEVELOPMENTS OF ARTIFICIAL INTELLIGENCE TECHNOLOGIES IN COMPUTATION AND ROBOTICS, 2020, 12 : 589 - 596
  • [38] Real-Time Synchrophasor Data Anomaly Detection and Classification Using Isolation Forest, KMeans, and LoOP
    Khaledian, Ehdieh
    Pandey, Shikhar
    Kundu, Pratim
    Srivastava, Anurag K.
    IEEE TRANSACTIONS ON SMART GRID, 2021, 12 (03) : 2378 - 2388
  • [39] Anomaly Data Detection of Rolling Element Bearings Vibration Signal Based on Parameter Optimization Isolation Forest
    Wang, Haiming
    Li, Qiang
    Liu, Yongqiang
    Yang, Shaopu
    MACHINES, 2022, 10 (06)
  • [40] Hydrological Time Series Anomaly Pattern Detection based on Isolation Forest
    Qin, Yu
    Lou, YuanSheng
    PROCEEDINGS OF 2019 IEEE 3RD INFORMATION TECHNOLOGY, NETWORKING, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (ITNEC 2019), 2019 : 1706 - 1710