Fast anomaly detection with locality-sensitive hashing and hyperparameter autotuning

Cited by: 6
Authors
Meira, Jorge [1, 2]
Eiras-Franco, Carlos [1]
Bolon-Canedo, Veronica [1]
Marreiros, Goreti [2]
Alonso-Betanzos, Amparo [1]
Affiliations
[1] Univ A Coruna, CITIC, La Coruna 15071, Spain
[2] Inst Engn Polytech Porto ISEP IPP, GECAD, Porto, Portugal
Keywords
Anomaly detection; Unsupervised learning; AutoML; Scalability; Big data; Outlier detection; Network
DOI
10.1016/j.ins.2022.06.035
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper presents LSHAD, an anomaly detection (AD) method based on Locality Sensitive Hashing (LSH), capable of dealing with large-scale datasets. The resulting algorithm is highly parallelizable and its implementation in Apache Spark further increases its ability to handle very large datasets. Moreover, the algorithm incorporates an automatic hyperparameter tuning mechanism so that users do not have to implement costly manual tuning. Our LSHAD method is novel as both hyperparameter automation and distributed properties are not usual in AD techniques. Our results for experiments with LSHAD across a variety of datasets point to state-of-the-art AD performance while handling much larger datasets than state-of-the-art alternatives. In addition, evaluation results for the tradeoff between AD performance and scalability show that our method offers significant advantages over competing methods. (C) 2022 Elsevier Inc. All rights reserved.
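The abstract does not spell out how LSHAD turns hash buckets into anomaly scores, so the following is only a minimal, single-machine sketch of the general idea behind LSH-based anomaly detection, not the authors' algorithm. It assumes a random-projection (Euclidean) hash family and scores a point by how sparsely populated its buckets are. The function name `lsh_bucket_scores` and the parameters `n_hashes`, `n_tables`, `width`, and `seed` are hypothetical, and neither the Apache Spark parallelization nor the automatic hyperparameter tuning described above is reproduced here.

```python
import math
import random
from collections import Counter


def lsh_bucket_scores(points, n_hashes=2, n_tables=8, width=1.0, seed=0):
    """Return one score in (0, 1] per point; higher means more isolated.

    Illustrative only: all parameter names and defaults are assumptions,
    not values taken from the LSHAD paper.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    scores = [0.0] * len(points)
    for _ in range(n_tables):
        # One table = n_hashes random projections a.x + b,
        # each quantised into cells of size `width`.
        projections = [
            ([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, width))
            for _ in range(n_hashes)
        ]
        keys = [
            tuple(
                math.floor((sum(a_d * x_d for a_d, x_d in zip(a, p)) + b) / width)
                for a, b in projections
            )
            for p in points
        ]
        bucket_sizes = Counter(keys)
        # A point that shares its bucket with few others looks anomalous.
        for i, key in enumerate(keys):
            scores[i] += 1.0 / bucket_sizes[key]
    # Average over tables to smooth out unlucky projections.
    return [s / n_tables for s in scores]
```

Under these assumptions, a point lying far from every dense region tends to occupy a bucket by itself in most tables and so scores close to 1, while points inside a dense cluster share buckets and score much lower. Each table is computed independently of the others, which hints at why bucket-based approaches of this kind lend themselves to parallel execution.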
Pages: 1245-1264
Page count: 20
Related papers
50 in total
  • [1] Learnable Locality-Sensitive Hashing for Video Anomaly Detection
    Lu, Yue
    Cao, Congqi
    Zhang, Yifan
    Zhang, Yanning
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (02) : 963 - 976
  • [2] Locality-Sensitive Hashing Optimizations for Fast Malware Clustering
    Oprisa, Ciprian
    Checiches, Marius
    Nandrean, Adrian
    2014 IEEE INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTER COMMUNICATION AND PROCESSING (ICCP), 2014, : 97 - +
  • [3] Fast Redescription Mining Using Locality-Sensitive Hashing
    Karjalainen, Maiju
    Galbrun, Esther
    Miettinen, Pauli
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: RESEARCH TRACK, PT VII, ECML PKDD 2024, 2024, 14947 : 124 - 142
  • [4] In Defense of Locality-Sensitive Hashing
    Ding, Kun
    Huo, Chunlei
    Fan, Bin
    Xiang, Shiming
    Pan, Chunhong
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (01) : 87 - 103
  • [5] Kernelized Locality-Sensitive Hashing
    Kulis, Brian
    Grauman, Kristen
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2012, 34 (06) : 1092 - 1104
  • [6] Correlated Locality-Sensitive Hashing
    Pagh, Rasmus
    ALGORITHMS - ESA 2015, 2015, 9294
  • [7] A Machine Learning approach for anomaly detection on the Internet of Things based on Locality-Sensitive Hashing
    Hernandez-Jaimes, Mireya Lucia
    Martinez-Cruz, Alfonso
    Ramirez-Gutierrez, Kelseyalejandra
    INTEGRATION-THE VLSI JOURNAL, 2024, 96
  • [8] Fast Access for Star Catalog Based on Locality-Sensitive Hashing
    Zhu H.
    Liang B.
    Zhang T.
    Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University, 2018, 36 (05) : 988 - 994
  • [9] Fast hierarchical clustering algorithm using locality-sensitive hashing
    Koga, H
    Ishibashi, T
    Watanabe, T
    DISCOVERY SCIENCE, PROCEEDINGS, 2004, 3245 : 114 - 128
  • [10] An Improved Algorithm for Locality-Sensitive Hashing
    Cen, Wei
    Miao, Kehua
    10TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE & EDUCATION (ICCSE 2015), 2015, : 61 - 64