ImpactTracer: Root Cause Localization in Microservices Based on Fault Propagation Modeling

Cited by: 0
Authors
Xie, Ru [1 ,2 ]
Yang, Jing [1 ]
Li, Jingying [1 ,2 ]
Wang, Liming [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Cyber Secur, Beijing, Peoples R China
Keywords
microservice; cloud-native; dependability; fault modeling; root cause identification
DOI
10.23919/DATE56975.2023.10137078
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Microservice architecture is embraced by a growing number of enterprises due to the benefits of modularity and flexibility. However, being composed of numerous interdependent microservices, it is prone to cascading failures and makes troubleshooting difficult: identifying the root cause node and ensuring service availability takes arduous effort. Previous works use call graphs to characterize causal relationships among microservices, but not completely or comprehensively, leading to an insufficient search of potential root cause nodes and consequently poor accuracy in culprit localization. In this paper, we propose ImpactTracer to address these problems. ImpactTracer builds an impact graph to provide a complete view of fault propagation in microservices and uses a novel backward tracing algorithm that exhaustively traverses the impact graph to identify the root cause node accurately. Extensive experiments on a real-world dataset demonstrate that ImpactTracer is effective in identifying the root cause node and outperforms the state-of-the-art methods by at least 72%, significantly facilitating troubleshooting in microservices.
Pages: 6
Related papers
50 in total
  • [31] Root cause analysis of actuator fault based on invertibility of interconnected system
    Zhang M.
    Li Z.-T.
    Cabassud M.
    Dahhou B.
    [J]. International Journal of Modelling, Identification and Control, 2017, 27 (04) : 256 - 270
  • [32] CAUSAL ALIGNMENT BASED FAULT ROOT CAUSES LOCALIZATION FOR WIRELESS NETWORK
    Liu, Yuequn
    Zhu, Wenhui
    Qiao, Jie
    Huang, Zhiyi
    Xiang, Yu
    Chen, Xuanzhi
    Chen, Wei
    Cai, Ruichu
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 9311 - 9315
  • [33] Towards a Fault Taxonomy for Microservices-Based Applications
    Filho, Francisco Gutenberg S.
    Lelli, Valeria
    Santos, Ismayle de Sousa
    Andrade, Rossana M. C.
    [J]. 36TH BRAZILIAN SYMPOSIUM ON SOFTWARE ENGINEERING, SBES 2022, 2022, : 247 - 256
  • [34] A Fault Propagation Modeling Method Based on a Finite State Machine
    Chen, Xi
    Jiao, Jian
    [J]. 2017 ANNUAL RELIABILITY AND MAINTAINABILITY SYMPOSIUM, 2017,
  • [35] A Fault Propagation Modeling and Analysis Method Based on Model Checking
    Chen, Lu
    Jiao, Jian
    Fan, Jiping
    Ren, Fuchun
    [J]. ANNUAL RELIABILITY AND MAINTAINABILITY SYMPOSIUM 2016 PROCEEDINGS, 2016,
  • [36] An integrated methodology for fault detection, root cause diagnosis, and propagation pathway analysis in chemical process systems
    Amin, Md. Tanjin
    [J]. CLEANER ENGINEERING AND TECHNOLOGY, 2021, 4
  • [37] A novel fault localization method with fault propagation context analysis
    Ma, Peijun
    Wang, Yu
    Su, Xiaohong
    Wang, Tiantian
    [J]. 2013 THIRD INTERNATIONAL CONFERENCE ON INSTRUMENTATION & MEASUREMENT, COMPUTER, COMMUNICATION AND CONTROL (IMCCC), 2013, : 1194 - 1199
  • [38] A novel dynamic bayesian network-based networked process monitoring approach for fault detection, propagation identification, and root cause diagnosis
    Yu, Jie
    Rashid, Mudassir M.
    [J]. AICHE JOURNAL, 2013, 59 (07) : 2348 - 2365
  • [39] HeMiRCA: Fine-Grained Root Cause Analysis for Microservices with Heterogeneous Data Sources
    Zhu, Zhouruixing
    Lee, Cheryl
    Tang, Xiaoying
    He, Pinjia
    [J]. ACM Transactions on Software Engineering and Methodology, 2024, 33 (08)
  • [40] Sleuth: A Trace-Based Root Cause Analysis System for Large-Scale Microservices with Graph Neural Networks
    Gan, Yu
    Liu, Guiyang
    Zhang, Xin
    Zhou, Qi
    Wu, Jiesheng
    Jiang, Jiangwei
    [J]. PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON ARCHITECTURAL SUPPORT FOR PROGRAMMING LANGUAGES AND OPERATING SYSTEMS, ASPLOS 2023, VOL 4, 2023, : 324 - 337