Practical Anomaly Detection over Multivariate Monitoring Metrics for Online Services

Cited by: 2
Authors
Liu, Jinyang [1]
Yang, Tianyi [1]
Chen, Zhuangbin [2]
Su, Yuxin [2]
Feng, Cong [3]
Yang, Zengyin [3]
Lyu, Michael R. [1]
Affiliations
[1] Chinese Univ Hong Kong, Hong Kong, Peoples R China
[2] Sun Yat Sen Univ, Sch Software Engn, Zhuhai, Peoples R China
[3] Huawei Cloud Comp Technol Co Ltd, Comp & Networking Innovat Lab, Dongguan, Peoples R China
Source
2023 IEEE 34TH INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING, ISSRE | 2023
Funding
National Natural Science Foundation of China;
Keywords
Anomaly Detection; Multivariate Monitoring Metrics; Software Reliability;
DOI
10.1109/ISSRE59848.2023.00045
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As modern software systems continue to grow in complexity and volume, anomaly detection on multivariate monitoring metrics, which profile a system's health status, becomes increasingly critical and challenging. In particular, the dependency between different metrics and their historical patterns plays a critical role in prompt and accurate anomaly detection. Existing approaches fall short of industrial needs because they cannot capture such information efficiently. To fill this gap, we propose CMAnomaly, an anomaly detection framework for multivariate monitoring metrics based on a collaborative machine. The proposed collaborative machine is a mechanism that captures pairwise interactions along both the feature and temporal dimensions with linear time complexity. Cost-effective models can then be employed to leverage both the dependency between monitoring metrics and their historical patterns for anomaly detection. The proposed framework is extensively evaluated with both public data and industrial data collected from a large-scale online service system of Huawei Cloud. The experimental results demonstrate that, compared with state-of-the-art baseline models, CMAnomaly achieves an average F1 score of 0.9494, outperforming baselines by 6.77% to 10.68%, and runs 10x to 20x faster. Furthermore, we also share our experience of deploying CMAnomaly in Huawei Cloud.
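The abstract does not say how pairwise interactions can be captured in linear time. One well-known way to get that property, shown here only as an illustrative sketch and not as the authors' collaborative machine, is the identity sum over i<j of x_i * x_j = ((sum x_i)^2 - sum x_i^2) / 2. The function names and the sample metric values below are hypothetical.

```python
from itertools import combinations


def pairwise_sum_naive(values):
    """Sum of x_i * x_j over all pairs i < j, in O(n^2) time."""
    return sum(a * b for a, b in combinations(values, 2))


def pairwise_sum_linear(values):
    """Same quantity in O(n) time via ((sum x)^2 - sum x^2) / 2."""
    total = sum(values)
    total_sq = sum(v * v for v in values)
    return (total * total - total_sq) / 2


# Hypothetical readings of four metrics at one time step (assumed data).
metrics = [0.5, 1.0, 2.0, 4.0]
naive = pairwise_sum_naive(metrics)
fast = pairwise_sum_linear(metrics)
```

Both functions return the same value for any input, but the second touches each metric only once, which is the kind of saving the abstract's "linear time complexity" refers to.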
Pages: 36 - 45
Number of pages: 10
Related papers
50 in total
  • [31] PERFORMABILITY METRICS FOR ONLINE SERVICES IN EDUCATION
    Albeanu, Grigore
    Popentiu-Vladicescu, Florin
    QUALITY AND EFFICIENCY IN E-LEARNING, VOL 2, 2013, : 29 - 34
  • [32] Jump-Starting Multivariate Time Series Anomaly Detection for Online Service Systems
    Ma, Minghua
    Zhang, Shenglin
    Chen, Junjie
    Xu, Jun
    Li, Haozhe
    Lin, Yongliang
    Nie, Xiaohui
    Zhou, Bo
    Wang, Yong
    Pei, Dan
    PROCEEDINGS OF THE 2021 USENIX ANNUAL TECHNICAL CONFERENCE, 2021, : 413 - 426
  • [33] Online Anomaly Detection for Smartphone-Based Multivariate Behavioral Time Series Data
    Liu, Gang
    Onnela, Jukka-Pekka
    SENSORS, 2022, 22 (06)
  • [34] Unsupervised anomaly detection in multivariate time series with online evolving spiking neural networks
    Baessler, Dennis
    Kortus, Tobias
    Guehring, Gabriele
    MACHINE LEARNING, 2022, 111 (04) : 1377 - 1408
  • [35] Unsupervised Online Anomaly Detection on Multivariate Sensing Time Series Data for Smart Manufacturing
    Hsieh, Ruei-Jie
    Chou, Jerry
    Ho, Chih-Hsiang
    2019 IEEE 12TH CONFERENCE ON SERVICE-ORIENTED COMPUTING AND APPLICATIONS (SOCA 2019), 2019, : 90 - 97
  • [36] Unsupervised anomaly detection in multivariate time series with online evolving spiking neural networks
    Dennis Bäßler
    Tobias Kortus
    Gabriele Gühring
    Machine Learning, 2022, 111 : 1377 - 1408
  • [37] Anomaly detection over differential preserved privacy in online social networks
    Aljably, Randa
    Tian, Yuan
    Al-Rodhaan, Mznah
    Al-Dhelaan, Abdullah
    PLOS ONE, 2019, 14 (04)
  • [38] fKPISelect: Fault-Injection Based Automated KPI Selection for Practical Multivariate Anomaly Detection
    Zhang, Xingjian
    Zhao, Yinqin
    Liu, Chang
    Wang, Long
    Yang, Xin
    Hou, Yefei
    Lan, Zhongwen
    Hu, Xining
    Miao, Beibei
    Yang, Ming
    Jing, Xiangyi
    Li, Sijie
    2023 IEEE 34TH INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING, ISSRE, 2023, : 183 - 194
  • [39] Advanced methods of multivariate anomaly detection
    Schaum, A.
    2007 IEEE AEROSPACE CONFERENCE, VOLS 1-9, 2007, : 2088 - 2094
  • [40] Multivariate Anomaly Detection with Domain Clustering
    Boesel, Frederic
    Schlapfer, Livio
    Pozidis, Haris
    Gusat, Mitch
    PROCEEDINGS OF THE 2023 ACM SYMPOSIUM ON CLOUD COMPUTING, SOCC 2023, 2023, : 193 - 199