A scalable architecture for online anomaly detection of WLCG batch jobs

Cited by: 0
Authors
Kuehn, E. [1]
Fischer, M. [1]
Giffels, M. [1]
Jung, C. [1]
Petzold, A. [1]
Affiliations
[1] Karlsruhe Inst Technol, Steinbuch Ctr Comp, Hermann von Helmholtz Pl 1, D-76344 Eggenstein Leopoldshafen, Germany
DOI
10.1088/1742-6596/762/1/012002
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
For data centres it is increasingly important to monitor network usage and to learn from network usage patterns. In particular, configuration issues or misbehaving batch jobs that prevent smooth operation need to be detected as early as possible. At the GridKa data and computing centre we therefore operate a tool, BPNetMon, for monitoring traffic data and characteristics of WLCG batch jobs and pilots locally on different worker nodes. On the one hand, local information alone is not sufficient to detect anomalies for several reasons, e.g. the underlying job distribution on a single worker node might change or there might be a local misconfiguration. On the other hand, a centralised anomaly detection approach does not scale with respect to network communication or computational costs. We therefore propose a scalable architecture based on concepts of a super-peer network.
Pages: 5
Related papers
50 in total
  • [1] Clustering Evolving Batch System Jobs for Online Anomaly Detection
    Kuehn, Eileen
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOP (ICDMW), 2015: 1534-1535
  • [2] Analyzing data flows of WLCG jobs at batch job level
    Kuehn, Eileen
    Fischer, Max
    Giffels, Manuel
    Jung, Christopher
    Petzold, Andreas
    [J]. 16TH INTERNATIONAL WORKSHOP ON ADVANCED COMPUTING AND ANALYSIS TECHNIQUES IN PHYSICS RESEARCH (ACAT2014), 2015, 608
  • [3] Online and Scalable Unsupervised Network Anomaly Detection Method
    Dromard, Juliette
    Roudiere, Gilles
    Owezarski, Philippe
    [J]. IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT, 2017, 14(1): 34-47
  • [4] Isolation Mondrian Forest for Batch and Online Anomaly Detection
    Ma, Haoran
    Ghojogh, Benyamin
    Samad, Maria N.
    Zheng, Dongyu
    Crowley, Mark
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2020: 3051-3058
  • [5] Scalable Architecture for Anomaly Detection and Visualization in Power Generating Assets
    Jain, Paras
    Tailor, Chirag
    Ford, Sam
    Ding, Liexiao
    Phillips, Michael
    Liu, Fang
    Gebraeel, Nagi
    Chau, Duen Horng
    [J]. 2017 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2017: 1078-1082
  • [6] Batch and online anomaly detection for scientific applications in a Kubernetes environment
    Hariri, Sahand
    Kind, Matias Carrasco
    [J]. PROCEEDINGS OF THE ACM WORKSHOP ON SCIENTIFIC CLOUD COMPUTING (SCIENCECLOUD'18), 2018
  • [7] ADGAN: A Scalable GAN-based Architecture for Image Anomaly Detection
    Cheng, Haoqing
    Liu, Heng
    Gao, Fei
    Chen, Zhuo
    [J]. PROCEEDINGS OF 2020 IEEE 4TH INFORMATION TECHNOLOGY, NETWORKING, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (ITNEC 2020), 2020: 987-993
  • [8] Online Anomaly Energy Consumption Detection Using Lambda Architecture
    Liu, Xiufeng
    Iftikhar, Nadeem
    Nielsen, Per Sieverts
    Heller, Alfred
    [J]. BIG DATA ANALYTICS AND KNOWLEDGE DISCOVERY, DAWAK 2016, 2016, 9829: 193-209
  • [9] A scalable anomaly detection and mitigation architecture for legacy networks via an OpenFlow middlebox
    Giotis, Kostas
    Androulidakis, George
    Maglaris, Vasilis
    [J]. SECURITY AND COMMUNICATION NETWORKS, 2016, 9(13): 1958-1970
  • [10] Scalable prediction-based online anomaly detection for smart meter data
    Liu, Xiufeng
    Nielsen, Per Sieverts
    [J]. INFORMATION SYSTEMS, 2018, 77: 34-47