Practical Anomaly Detection over Multivariate Monitoring Metrics for Online Services

Cited by: 2
Authors
Liu, Jinyang [1]
Yang, Tianyi [1]
Chen, Zhuangbin [2]
Su, Yuxin [2]
Feng, Cong [3]
Yang, Zengyin [3]
Lyu, Michael R. [1]
Affiliations
[1] Chinese Univ Hong Kong, Hong Kong, Peoples R China
[2] Sun Yat Sen Univ, Sch Software Engn, Zhuhai, Peoples R China
[3] Huawei Cloud Comp Technol Co Ltd, Comp & Networking Innovat Lab, Dongguan, Peoples R China
Source
2023 IEEE 34TH INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING, ISSRE | 2023
Funding
National Natural Science Foundation of China;
Keywords
Anomaly Detection; Multivariate Monitoring Metrics; Software Reliability;
DOI
10.1109/ISSRE59848.2023.00045
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As modern software systems continue to grow in complexity and volume, anomaly detection on multivariate monitoring metrics, which profile systems' health status, becomes increasingly critical and challenging. In particular, the dependency between different metrics and their historical patterns plays a critical role in prompt and accurate anomaly detection. Existing approaches fall short of industrial needs because they cannot capture such information efficiently. To fill this gap, we propose CMAnomaly, an anomaly detection framework for multivariate monitoring metrics based on a collaborative machine. The proposed collaborative machine is a mechanism that captures pairwise interactions along both the feature and temporal dimensions with linear time complexity. Cost-effective models can then be employed to leverage both the dependency between monitoring metrics and their historical patterns for anomaly detection. The proposed framework is extensively evaluated with both public data and industrial data collected from a large-scale online service system of Huawei Cloud. The experimental results demonstrate that, compared with state-of-the-art baseline models, CMAnomaly achieves an average F1 score of 0.9494, outperforming the baselines by 6.77% to 10.68%, and runs 10x to 20x faster. Furthermore, we share our experience of deploying CMAnomaly in Huawei Cloud.
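The abstract does not spell out how pairwise interactions can be captured in linear time. One well-known way to do this, shown here only as an illustrative sketch and not as the authors' formulation, is the factorization-machine identity, which rewrites a sum over all pairs as a difference of two single-pass sums. The names `pairwise_naive`, `pairwise_linear`, the embedding matrix `V`, and the toy dimensions are assumptions made for this example.

```python
import numpy as np


def pairwise_naive(x, V):
    """Sum of <V[i], V[j]> * x[i] * x[j] over all pairs i < j (quadratic in n)."""
    n = len(x)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += float(V[i] @ V[j]) * x[i] * x[j]
    return total


def pairwise_linear(x, V):
    """Same quantity via the factorization-machine identity (linear in n).

    For each latent dimension f:
        sum_{i<j} v_if v_jf x_i x_j
            = 0.5 * ((sum_i v_if x_i)^2 - sum_i (v_if x_i)^2)
    """
    xv = V * x[:, None]                    # shape (n, k): each row scaled by x[i]
    square_of_sum = xv.sum(axis=0) ** 2    # shape (k,)
    sum_of_squares = (xv ** 2).sum(axis=0) # shape (k,)
    return 0.5 * float((square_of_sum - sum_of_squares).sum())


# Hypothetical toy input: 8 metric values, each with a 4-dimensional embedding.
rng = np.random.default_rng(0)
x = rng.normal(size=8)
V = rng.normal(size=(8, 4))

naive_value = pairwise_naive(x, V)
linear_value = pairwise_linear(x, V)
```

Both functions return the same value up to floating-point error; the second avoids the nested loop, which is what makes this style of interaction term affordable when the number of metrics or time steps is large.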
Pages: 36 - 45
Page count: 10
Related papers
50 in total
  • [1] Online Conditional Anomaly Detection in Multivariate Data for Transformer Monitoring
    Catterson, Victoria M.
    McArthur, Stephen D. J.
    Moss, Graham
    IEEE TRANSACTIONS ON POWER DELIVERY, 2010, 25 (04) : 2556 - 2564
  • [2] Online Conditional Anomaly Detection in Multivariate Data for Transformer Monitoring
    Catterson, Victoria
    McArthur, Stephen
    Moss, Graham
    2011 IEEE POWER AND ENERGY SOCIETY GENERAL MEETING, 2011,
  • [3] ONLINE ANOMALY DETECTION IN MULTIVARIATE SETTINGS
    Mozaffari, Mahsa
    Yilmaz, Yasin
    2019 IEEE 29TH INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP), 2019,
  • [4] Scalable and accurate online multivariate anomaly detection
    Salles, Rebecca
    Lange, Benoit
    Akbarinia, Reza
    Masseglia, Florent
    Ogasawara, Eduardo
    Pacitti, Esther
    INFORMATION SYSTEMS, 2025, 131
  • [5] Unsupervised Anomaly Event Detection for VNF Service Monitoring using Multivariate Online Arima
    Schmidt, Florian
    Suri-Payer, Florian
    Gulenko, Anton
    Wallschlaeger, Marcel
    Acker, Alexander
    Kao, Odej
    2018 16TH IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING TECHNOLOGY AND SCIENCE (CLOUDCOM 2018), 2018, : 278 - 283
  • [6] A multivariate online anomaly detection algorithm based on SVD updating
    Qian Y.-K.
    Chen M.
    Dianzi Yu Xinxi Xuebao/Journal of Electronics and Information Technology, 2010, 32 (10) : 2404 - 2409
  • [7] Novel semi-metrics for multivariate change point analysis and anomaly detection
    James, Nick
    Menzies, Max
    Azizi, Lamiae
    Chan, Jennifer
    PHYSICA D-NONLINEAR PHENOMENA, 2020, 412
  • [8] Practical Approach to Asynchronous Multivariate Time Series Anomaly Detection and Localization
    Abdulaal, Ahmed
    Liu, Zhuanghua
    Lancewicki, Tomer
    KDD '21: PROCEEDINGS OF THE 27TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2021, : 2485 - 2494
  • [9] Practical Anomaly Detection in Internet Services: An ISP centric approach
    Feng, Alex Huang
    Francois, Pierre
    Fukuda, Kensuke
    Du, Wanting
    Graf, Thomas
    Lucente, Paolo
    Frenot, Stephane
    PROCEEDINGS OF 2024 IEEE/IFIP NETWORK OPERATIONS AND MANAGEMENT SYMPOSIUM, NOMS 2024, 2024,
  • [10] ONLINE REACTIVE ANOMALY DETECTION OVER STREAM DATA
    Fu, Yan
    Zhou, Jun-Lin
    Wu, Yue
    2008 INTERNATIONAL CONFERENCE ON APPERCEIVING COMPUTING AND INTELLIGENCE ANALYSIS (ICACIA 2008), 2008, : 291 - 294