Clustering-based real-time anomaly detection: A breakthrough in big data technologies

Cited by: 46
Authors
Habeeb, Riyaz Ahamed Ariyaluran [1]
Nasaruddin, Fariza [1]
Gani, Abdullah [6]
Amanullah, Mohamed Ahzam [3]
Hashem, Ibrahim Abaker Targio [2]
Ahmed, Ejaz [4]
Imran, Muhammad [5]
Affiliations
[1] Univ Malaya, Fac Comp Sci & Informat Technol, Dept Informat Syst, Kuala Lumpur 50603, Malaysia
[2] Taylors Univ, Sch Comp & Informat Technol, Subang Jaya, Malaysia
[3] Telekom Res & Dev Sdn Bhd, Res & Innovat Dev, Cyberjaya, Malaysia
[4] Univ Malaya, Ctr Mobile Cloud Comp Res C4MCCR, Kuala Lumpur, Malaysia
[5] King Saud Univ, Coll Appl Comp Sci, Riyadh, Saudi Arabia
[6] Univ Malaya, Dept Comp Syst & Technol, Fac Comp Sci & Informat Technol, Kuala Lumpur, Malaysia
Keywords
DETECTION SYSTEM; FRAMEWORK; INTERNET; MACHINE;
DOI
10.1002/ett.3647
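Chinese Library Classification (CLC) number
TN [Electronic technology, communication technology];
Discipline classification code
0809;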
Abstract
Of late, the ever-increasing use of connected Internet-of-Things devices has augmented the volume of high-velocity, real-time network data. At the same time, threats to networks have become inevitable; hence, identifying anomalies in real-time network data has become crucial. To date, most existing anomaly detection approaches focus mainly on machine learning techniques for batch processing. Meanwhile, detection approaches that focus on real-time analytics are deficient in detection accuracy while consuming more memory and requiring longer execution time. As such, this paper proposes a novel framework for real-time anomaly detection based on big data technologies. In addition, this paper develops a streaming sliding window local outlier factor coreset clustering (SSWLOFCC) algorithm, which was then implemented in the framework. The proposed framework, which comprises BroIDS, Flume, Kafka, Spark streaming, SparkMLlib, Matplot and HBase, was evaluated to substantiate its efficacy, particularly in terms of accuracy, memory consumption, and execution time. The evaluation was done through a critical comparative analysis against existing approaches, such as K-means, hierarchical density-based spatial clustering of applications with noise (HDBSCAN), isolation forest, spectral clustering and agglomerative clustering. Moreover, the Adjusted Rand Index and a memory profiler package were used to evaluate the proposed framework against the existing approaches. The evaluation shows the efficacy of the proposed framework, which achieved a much higher accuracy rate of 96.51% than the other algorithms. In addition, the proposed framework outperformed the existing algorithms with lower memory consumption and shorter execution time. Ultimately, the proposed solution enables analysts to precisely track and detect anomalies in real time.
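The abstract names the Adjusted Rand Index (ARI) as the accuracy measure used to compare clustering approaches. As an illustration only, the following is a minimal pure-Python sketch of the standard ARI formula; the function name `adjusted_rand_index` and the toy label lists are assumptions made here, not code or data from the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Standard Adjusted Rand Index between two labelings of the same items.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")

    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # zero or one item: the partitions trivially agree

    # Contingency counts: items sharing both a true label and a predicted label.
    cell_counts = Counter(zip(labels_true, labels_pred))
    row_counts = Counter(labels_true)
    col_counts = Counter(labels_pred)

    index = sum(comb(c, 2) for c in cell_counts.values())
    sum_rows = sum(comb(c, 2) for c in row_counts.values())
    sum_cols = sum(comb(c, 2) for c in col_counts.values())

    expected = sum_rows * sum_cols / total_pairs
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (index - expected) / (max_index - expected)


# Toy labels (assumed): 0 = normal traffic, 1 = anomalous traffic.
truth = [0, 0, 0, 1, 1, 0]
predicted = [1, 1, 1, 0, 0, 1]  # same grouping, different label names
score = adjusted_rand_index(truth, predicted)
```

ARI depends only on how items are grouped, not on the label names, so the relabeled prediction above scores as a perfect match; values near zero indicate agreement no better than chance.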
Pages: 27
Related papers
50 in total
  • [21] An ML Based Anomaly Detection System in real-time data streams
    Diaz Rivera, Javier Jose
    Khan, Talha Ahmed
    Akbar, Waleed
    Afaq, Muhammad
    Song, Wang-Cheol
    2021 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND COMPUTATIONAL INTELLIGENCE (CSCI 2021), 2021, : 1329 - 1334
  • [22] OpenK: An Elastic Data Cleansing System with A Clustering-based Data Anomaly Detection Approach
    Tran Khanh Dang
    Dinh Khuong Nguyen
    Luc Minh Tuan
    2021 15TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING AND APPLICATIONS (ACOMP 2021), 2021, : 120 - 127
  • [23] Online Anomaly Detection Leveraging Stream-Based Clustering and Real-Time Telemetry
    Putina, Andrian
    Rossi, Dario
    IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT, 2021, 18 (01): : 839 - 854
  • [24] Unsupervised real-time anomaly detection for streaming data
    Ahmad, Subutai
    Lavin, Alexander
    Purdy, Scott
    Agha, Zuha
    NEUROCOMPUTING, 2017, 262 : 134 - 147
  • [25] Proposed Model for Real-Time Anomaly Detection in Big IoT Sensor Data for Smart City
    Hasani Z.
    Krrabaj S.
    Krasniqi M.
    International Journal of Interactive Mobile Technologies, 2024, 18 (03): : 32 - 44
  • [26] Clustering-Based Granular Representation of Time Series With Application to Collective Anomaly Detection
    Shi, Wen
    Karastoyanova, Dimka
    Ma, Yongsheng
    Huang, Yongming
    Zhang, Guobao
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2023, 72
  • [27] Real-time Big Data Technologies of Energy Internet Platform
    Wang Guilan
    Zhou Guoliang
    Zhao Hongshan
    Liu Hongyang
    2016 IEEE INTERNATIONAL CONFERENCE ON POWER SYSTEM TECHNOLOGY (POWERCON), 2016,
  • [28] A Clustering-Based Unsupervised Approach to Anomaly Intrusion Detection
    Nikolova, Evgeniya
    Jecheva, Veselina
    PROCEEDINGS OF THE 2ND INTERNATIONAL SYMPOSIUM ON COMPUTER, COMMUNICATION, CONTROL AND AUTOMATION, 2013, 68 : 202 - 205
  • [29] A Hybrid Unsupervised Clustering-Based Anomaly Detection Method
    Guo Pu
    Lijuan Wang
    Jun Shen
    Fang Dong
    Tsinghua Science and Technology, 2021, 26 (02) : 146 - 153
  • [30] Clustering-based label estimation for network anomaly detection
    Sunhee Baek
    Donghwoon Kwon
    Sang C Suh
    Hyunjoo Kim
    Ikkyun Kim
    Jinoh Kim
    Digital Communications and Networks, 2021, 7 (01) : 37 - 44