Anomaly detection in large-scale data stream networks

Cited by: 0
Authors
Duc-Son Pham
Svetha Venkatesh
Mihai Lazarescu
Saha Budhaditya
Affiliations
[1] Curtin University,Department of Computing
[2] Deakin University,Center for Pattern Recognition and Data Analytics (PRaDA)
Keywords
Anomaly detection; Random projection; Sensor network data; Spectral methods; Compressed sensing; Residual subspace analysis; Stream data processing
DOI
Not available
Abstract
This paper addresses the anomaly detection problem in large-scale data mining applications using residual subspace analysis. We are specifically concerned with situations where the full data cannot be practically obtained due to physical limitations such as low bandwidth, limited memory, storage, or computing power. Motivated by the recent compressed sensing (CS) theory, we suggest a framework wherein random projection can be used to obtain compressed data, addressing the scalability challenge. Our theoretical contribution shows that the spectral property of the CS data is approximately preserved under such a projection, and thus the performance of spectral-based methods for anomaly detection is almost equivalent to the case in which the raw data is completely available. Our second contribution is the construction of the framework to use this result and detect anomalies in the compressed data directly, thus circumventing the problems of data acquisition in large sensor networks. We have conducted extensive experiments to detect anomalies in network and surveillance applications on large datasets, including the benchmark PETS 2007 and 83 GB of real footage from three public train stations. Our results show that our proposed method is scalable and, importantly, that its performance is comparable to conventional methods for anomaly detection when the complete data is available.
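The idea summarized in the abstract (residual subspace analysis applied directly to randomly projected data) can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian projection matrix, the synthetic low-rank data, the injected anomaly, and the helper names `residual_scores` and `random_project` are all assumptions made for illustration.

```python
import numpy as np

def residual_scores(X, k):
    """Squared residual of each row of X outside the top-k principal subspace."""
    Xc = X - X.mean(axis=0)
    # Right singular vectors give the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:k].T                       # (d, k) basis of the "normal" subspace
    R = Xc - Xc @ P @ P.T              # part of each row not explained by that subspace
    return np.sum(R ** 2, axis=1)

def random_project(X, m, rng):
    """Compress d-dimensional rows to m dimensions with a Gaussian matrix (assumed choice)."""
    d = X.shape[1]
    Phi = rng.normal(0.0, 1.0 / np.sqrt(m), size=(d, m))
    return X @ Phi

rng = np.random.default_rng(0)
n, d, k, m = 500, 200, 3, 40           # hypothetical sizes: samples, sensors, rank, compressed dim

# Synthetic data: rank-k structure plus small noise (illustrative only).
W = rng.normal(size=(k, d))
Z = 5.0 * rng.normal(size=(n, k))
X = Z @ W + 0.1 * rng.normal(size=(n, d))

# Inject one anomaly that lies mostly outside the normal subspace.
anomaly_row = 123
X[anomaly_row] += 3.0 * rng.normal(size=d)

raw = residual_scores(X, k)                              # scores from the full data
comp = residual_scores(random_project(X, m, rng), k)     # scores from compressed data only

print("top raw score at row:", int(np.argmax(raw)))
print("top compressed score at row:", int(np.argmax(comp)))
```

In this toy setting the residual score is computed once on the full 200-dimensional data and once on a 40-dimensional projection, so the two rankings can be compared. How closely they agree in practice depends on the compressed dimension and on the data, which is what the paper's theoretical result addresses.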
Pages: 145 - 189
Page count: 44
Related papers
50 in total
  • [31] DGraph: A Large-Scale Financial Dataset for Graph Anomaly Detection
    Huang, Xuanwen
    Yang, Yang
    Wang, Yang
    Wang, Chunping
    Zhang, Zhisheng
    Xu, Jiarong
    Chen, Lei
    Vazirgiannis, Michalis
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [32] DongTing: A large-scale dataset for anomaly detection of the Linux kernel
    Duan, Guoyun
    Fu, Yuanzhi
    Cai, Minjie
    Chen, Hao
    Sun, Jianhua
    [J]. JOURNAL OF SYSTEMS AND SOFTWARE, 2023, 203
  • [33] Connecting the dots: anomaly and discontinuity detection in large-scale systems
    Haroon Malik
    Ian J. Davis
    Michael W. Godfrey
    Douglas Neuse
    Serge Mankovskii
    [J]. Journal of Ambient Intelligence and Humanized Computing, 2016, 7 : 509 - 522
  • [34] A survey on data analysis on large-Scale wireless networks: online stream processing, trends, and challenges
    Medeiros, Dianne S., V
    Cunha Neto, Helio N.
    Lopez, Martin Andreoni
    Magalhaes, Luiz Claudio S.
    Fernandes, Natalia C.
    Vieira, Alex B.
    Silva, Edelberto F.
    Mattos, Diogo M. F.
    [J]. JOURNAL OF INTERNET SERVICES AND APPLICATIONS, 2020, 11 (01)
  • [35] A Hierarchical Framework for Smart Grid Anomaly Detection Using Large-Scale Smart Meter Data
    Moghaddass, Ramin
    Wang, Jianhui
    [J]. IEEE TRANSACTIONS ON SMART GRID, 2018, 9 (06) : 5820 - 5830
  • [36] Artificial neural networks for fault detection in large-scale data acquisition systems
    Jakubek, SM
    Strasser, TI
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2004, 17 (03) : 233 - 248
  • [37] Alovera: A Fast Stream Processing System for Large-Scale Data
    Zhang, Zhen'An
    Zhang, Dongjie
    Yu, Xiaopeng
    Wang, Jing
    He, Chunjiang
    Yuan, Pingpeng
    Jin, Hai
    [J]. 2013 8TH CHINAGRID ANNUAL CONFERENCE (CHINAGRID), 2013, : 74 - 79
  • [38] Fast Plagiarism Detection in Large-Scale Data
    Szmit, Radoslaw
    [J]. BEYOND DATABASES, ARCHITECTURES AND STRUCTURES: TOWARDS EFFICIENT SOLUTIONS FOR DATA ANALYSIS AND KNOWLEDGE REPRESENTATION, 2017, 716 : 329 - 343
  • [39] Constant Time EXPected Similarity Estimation for Large-Scale Anomaly Detection
    Schneider, Markus
    Ertel, Wolfgang
    Palm, Guenther
    [J]. ECAI 2016: 22ND EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016, 285 : 12 - 20
  • [40] Expected similarity estimation for large-scale batch and streaming anomaly detection
    Schneider, Markus
    Ertel, Wolfgang
    Ramos, Fabio
    [J]. MACHINE LEARNING, 2016, 105 (03) : 305 - 333