TraceModel: An Automatic Anomaly Detection and Root Cause Localization Framework for Microservice Systems

Cited by: 3
Authors
Cai, Yang [1]
Han, Biao [1]
Su, Jinshu [1]
Wang, Xiaoyan [2]
Affiliations
[1] Natl Univ Def Technol, Coll Comp, Changsha, Hunan, Peoples R China
[2] Ibaraki Univ, Coll Engn, Hitachi, Ibaraki, Japan
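Funding
National Natural Science Foundation of China
Keywords
microservice; anomaly detection; root cause localization
DOI
10.1109/MSN53354.2021.00081
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract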
A microservice system is a web application architecture that divides a single application into a suite of service nodes running as separate processes and communicating through lightweight messaging mechanisms. Although microservices can improve the abstraction, modularity, and extensibility of web applications, they make anomaly detection and fault root cause localization more challenging for operations staff. To this end, in this paper, we first introduce the concept of a service dependency graph (SDG) to depict the complex calling relationships between nodes, and then develop an anomaly detection and root cause localization framework called TraceModel, which consists of TraceVAE and ModelCoder. TraceVAE divides user requests into different request classes according to well-constructed traces and analyzes each class separately with a variational autoencoder (VAE) to identify abnormal requests. Based on the anomaly detection results of TraceVAE, ModelCoder localizes the root cause of unknown faults by comparing their fault features with predefined fault models. An evaluation of TraceModel on a real-world microservice system monitoring data set spanning 15 days shows that TraceModel can detect anomalies and localize the fault root cause nodes within 110 seconds on average. Furthermore, it improves root cause localization accuracy by 17.5%, to 97%, compared with the state-of-the-art root cause localization algorithm.
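The two preprocessing ideas named in the abstract, a service dependency graph and request classes derived from traces, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the span format `(trace_id, caller, callee)`, the service names, and the rule that two requests belong to the same class when they exercise the same set of caller-callee edges are all assumptions made for illustration. The VAE-based scoring and the fault-model matching are not shown.

```python
from collections import defaultdict

# Hypothetical span records: (trace_id, caller, callee).
# Real tracing data carries far more fields (timestamps, status codes, ...).
SPANS = [
    ("t1", "gateway", "orders"),
    ("t1", "orders", "db"),
    ("t2", "gateway", "orders"),
    ("t2", "orders", "db"),
    ("t3", "gateway", "users"),
]


def build_sdg(spans):
    """Build a service dependency graph as {caller: sorted list of callees}."""
    graph = defaultdict(set)
    for _trace_id, caller, callee in spans:
        graph[caller].add(callee)
    return {caller: sorted(callees) for caller, callees in graph.items()}


def group_by_request_class(spans):
    """Group trace ids by the set of call edges each trace exercises.

    Assumption: traces with an identical edge set form one request class.
    """
    edges_per_trace = defaultdict(set)
    for trace_id, caller, callee in spans:
        edges_per_trace[trace_id].add((caller, callee))

    classes = defaultdict(list)
    for trace_id, edges in edges_per_trace.items():
        classes[frozenset(edges)].append(trace_id)
    return {edges: sorted(ids) for edges, ids in classes.items()}


if __name__ == "__main__":
    print(build_sdg(SPANS))
    for edges, trace_ids in group_by_request_class(SPANS).items():
        print(sorted(edges), trace_ids)
```

Under these assumptions, each request class could then be given its own anomaly model, so that a slow but structurally normal call path is compared only against requests of the same shape.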
Pages: 512-519
Number of pages: 8