OptIForest: Optimal Isolation Forest for Anomaly Detection

Authors
Xiang, Haolong [1]
Zhang, Xuyun [1]
Hu, Hongsheng [2]
Qi, Lianyong [3]
Dou, Wanchun [4]
Dras, Mark [1]
Beheshti, Amin [1]
Xu, Xiaolong [5]
Affiliations
[1] Macquarie Univ, N Ryde, NSW 2109, Australia
[2] CSIRO's Data61, Eveleigh, NSW, Australia
[3] Qufu Normal Univ, Qufu, Shandong, Peoples R China
[4] Nanjing Univ, Nanjing, Peoples R China
[5] Nanjing Univ Informat Sci & Technol, Nanjing, Peoples R China
Funding
National Natural Science Foundation of China
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Anomaly detection plays an increasingly important role in critical tasks across many fields, such as intrusion detection in cybersecurity, financial risk detection, and human health monitoring. A variety of anomaly detection methods have been proposed, and the category based on the isolation forest mechanism stands out for its simplicity, effectiveness, and efficiency; for example, iForest is often employed as a state-of-the-art detector in real deployments. While the majority of isolation forests use a binary tree structure, the LSHiForest framework has demonstrated that a multi-fork isolation tree structure can lead to better detection performance. However, no theoretical work has answered the fundamentally and practically important question of which tree structure is optimal for an isolation forest with respect to the branching factor. In this paper, we establish a theory of isolation efficiency to answer this question and determine the optimal branching factor for an isolation tree. Building on this theoretical underpinning, we design a practical optimal isolation forest, OptIForest, which incorporates clustering-based learning to hash so that more information can be learned from the data for better isolation quality. The rationale of our approach is a better bias-variance trade-off, achieved through bias reduction in OptIForest. Extensive experiments on a series of benchmark datasets, covering comparative and ablation studies, demonstrate that our approach efficiently and robustly achieves better detection performance in general than state-of-the-art methods, including deep-learning-based ones.
Pages: 2379-2387
Page count: 9