KGroot: A knowledge graph-enhanced method for root cause analysis

Cited by: 1
Authors
Wang, Tingting [1]
Qi, Guilin [1]
Wu, Tianxing [1]
Affiliations
[1] Southeast Univ, Sch Comp Sci & Engn, Nanjing, Peoples R China
Keywords
Root cause analysis; Faults locating; Knowledge graph; GCN;
DOI
10.1016/j.eswa.2024.124679
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Fault localization in online microservices is challenging because of the vast volume of monitoring data, the diversity of data types and events, and the complex interdependencies among services and components. Fault events propagate between services and can trigger a cascade of faults within a short time. In industry, fault localization is typically performed manually by experienced personnel, an approach that depends on individual experience, is unreliable, and lacks automation. Information barriers between modules make it difficult for teams to align quickly during urgent faults, and this inefficiency undermines the stability-assurance goal of minimizing fault detection and repair time. Existing methods that aim to automate the process fall short in both accuracy and efficiency. The precision of fault localization results is of paramount importance because it underpins engineers' trust in the diagnostic conclusions, which are derived from multiple perspectives and offer comprehensive insights. A more reliable method is therefore required to automatically identify the associative relationships among fault events and their propagation paths. To this end, a knowledge graph-enhanced root cause analysis (KGroot) method is designed for efficient and effective diagnosis of recurring failures in complex microservice environments. As the first event-driven knowledge graph method, KGroot uses event knowledge and the correlations between events to perform root cause reasoning for Root Cause Analysis (RCA). A Fault Event Knowledge Graph (FEKG) is built from historical data, an online graph is constructed in real time when a failure event occurs, and graph convolutional networks (GCNs) compare the similarity between each event knowledge graph and the online graph to pinpoint the fault type through a ranking strategy. Comprehensive experiments demonstrate that KGroot places the root cause among its top 3 candidate causes with 93.5% accuracy, and does so within seconds.
This performance meets the requirements of real-time fault diagnosis in industrial environments and significantly surpasses state-of-the-art RCA baselines in both effectiveness and efficiency. (KGroot is available at https://github.com/daixixiwang/KGroot).
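The matching step described in the abstract (embed each historical fault graph and the online graph with a GCN, then rank fault types by similarity) can be illustrated with a small sketch. This is not the authors' implementation. The event-type vocabulary, the example graphs, the fault-type names, the tanh activation, the mean pooling, the cosine similarity, and the untrained random weights are all assumptions chosen to keep the example self-contained.

```python
import numpy as np

NUM_EVENT_TYPES = 6  # hypothetical size of the event-type vocabulary
HIDDEN_DIM = 8       # hypothetical embedding width


def normalized_adjacency(num_nodes, edges):
    """Return D^-1/2 (A + I) D^-1/2 for an undirected view of the graph."""
    a = np.eye(num_nodes)  # self-loops
    for src, dst in edges:
        a[src, dst] = 1.0
        a[dst, src] = 1.0  # simplification: edge direction is ignored
    inv_sqrt_deg = 1.0 / np.sqrt(a.sum(axis=1))
    return a * inv_sqrt_deg[:, None] * inv_sqrt_deg[None, :]


def embed_graph(event_types, edges, weights):
    """Run GCN-style propagation layers, then mean-pool nodes into one vector."""
    h = np.eye(NUM_EVENT_TYPES)[event_types]  # one-hot node features
    a_hat = normalized_adjacency(len(event_types), edges)
    for w in weights:
        h = np.tanh(a_hat @ h @ w)  # tanh is an assumption; ReLU is also common
    return h.mean(axis=0)


def cosine(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0


def rank_fault_types(online_graph, historical_graphs, weights, top_k=3):
    """Rank historical fault types by similarity to the online graph."""
    online_vec = embed_graph(*online_graph, weights)
    scores = [
        (fault_type, cosine(online_vec, embed_graph(*graph, weights)))
        for fault_type, graph in historical_graphs.items()
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)[:top_k]


# Untrained random weights stand in for a trained model (assumption).
rng = np.random.default_rng(0)
weights = [
    rng.normal(size=(NUM_EVENT_TYPES, HIDDEN_DIM)),
    rng.normal(size=(HIDDEN_DIM, HIDDEN_DIM)),
]

# Each graph is (event type per node, list of edges); all values are made up.
historical_graphs = {
    "cpu_exhaustion": ([0, 1, 2], [(0, 1), (1, 2)]),
    "network_delay": ([3, 4, 5], [(0, 1), (0, 2)]),
}
online_graph = ([0, 1, 2], [(0, 1), (1, 2)])

ranked = rank_fault_types(online_graph, historical_graphs, weights)
print(ranked)
```

In this toy setup the online graph is identical to the stored "cpu_exhaustion" graph, so that fault type ranks first with a similarity of 1. In practice the weights would be trained and the similarity function may differ from plain cosine similarity.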
Pages: 11