PerfInsight: A Robust Clustering-Based Abnormal Behavior Detection System for Large-Scale Cloud

Cited by: 9
Authors
Zhang, Xiao [1]
Meng, Fan Jing [1]
Xu, Jingmin [1]
Affiliations
[1] IBM Research China, Beijing, People's Republic of China
Keywords
cloud computing; anomaly detection; unsupervised clustering; large-scale cloud
DOI
10.1109/CLOUD.2018.00130
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Anomalous behaviors of cloud services usually lead to performance degradation or even unplanned outages, which dramatically harm their Quality of Service. Performance monitoring and anomaly detection systems have been widely applied to mitigate these risks. However, the huge volume of collected data, the prevalence of trends and noise in the data distribution, the lack of labelled anomalies, and the unpredictability of the various types of anomalies pose great challenges to existing anomaly detection systems in the real world. Recently, unsupervised clustering-based anomaly detection approaches have become promising solutions because they depend less on labelled data and adapt to various types of anomalies. Achieving better quality with clustering-based approaches, however, requires a huge amount of data normalization work. In this paper, we present PerfInsight, a practical and robust anomaly detection system for large-scale clouds. First, it detects potential trends in the collected data and automatically transforms them to reduce their negative impact on clustering results. Then, an entropy-based feature selection over the transformed metrics improves detection efficiency. Finally, more robust clustering models can be trained and used based on these well-transformed and selected features. Our evaluation results show that PerfInsight can significantly reduce the cardinality of models.
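The first two steps named in the abstract (trend detection with transformation, then entropy-based feature selection) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how trends are detected or how entropy is computed, so the correlation-with-time test, first-order differencing, histogram-based Shannon entropy, and every function name and threshold below are assumptions chosen for illustration only.

```python
import math
from statistics import mean, pstdev


def has_trend(series, threshold=0.8):
    """Assumed trend test: strong Pearson correlation with the time index."""
    n = len(series)
    t = list(range(n))
    st, ss = pstdev(t), pstdev(series)
    if st == 0 or ss == 0:
        return False  # constant series (or a single sample) has no trend
    mt, ms = mean(t), mean(series)
    cov = sum((a - mt) * (b - ms) for a, b in zip(t, series)) / n
    return abs(cov / (st * ss)) >= threshold


def detrend(series):
    """Assumed transform: first-order differencing (one sample shorter)."""
    return [b - a for a, b in zip(series, series[1:])]


def shannon_entropy(series, bins=10):
    """Shannon entropy (in bits) of the series' equal-width histogram."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return 0.0
    counts = [0] * bins
    for x in series:
        idx = min(int((x - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    n = len(series)
    return -sum(c / n * math.log2(c / n) for c in counts if c)


def preprocess(metrics, min_entropy=1.0):
    """Detrend trending metrics, then keep only metrics with enough entropy."""
    transformed = {}
    for name, s in metrics.items():
        # Drop the first sample of untouched series so all lengths match.
        transformed[name] = detrend(s) if has_trend(s) else s[1:]
    return {
        name: s
        for name, s in transformed.items()
        if shannon_entropy(s) >= min_entropy
    }


if __name__ == "__main__":
    # Hypothetical metrics: a flat one, a pure ramp, and a bursty one.
    metrics = {
        "constant": [7.0] * 12,
        "ramp": [2.0 * i for i in range(12)],
        "bursty": [0, 4, 1, 3] * 3,
    }
    print(sorted(preprocess(metrics)))
```

In this sketch a pure ramp becomes constant after differencing and is then discarded for carrying no information, while the bursty metric survives; the surviving features would be the input to whatever clustering model is trained in the final step, which is not shown here.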
Pages: 896-899
Number of pages: 4