Workflow-Aware Automatic Fault Diagnosis for Microservice-Based Applications With Statistics

Cited by: 31
Authors
Wang, Tao [1]
Zhang, Wenbo [1]
Xu, Jiwei [2]
Gu, Zeyu [3]
Affiliations
[1] Chinese Acad Sci, Inst Software, State Key Lab Comp Sci, Beijing 100190, Peoples R China
[2] Univ Coll Dublin, Sch Comp Sci, Dublin D02 PN40 4, Ireland
[3] Xia Mobile Software Co Ltd, Xia Internet Dept, Beijing 100085, Peoples R China
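Funding
National Natural Science Foundation of China; National Key Research and Development Program of China; Beijing Natural Science Foundation
Keywords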
Fault diagnosis; Time factors; Computer architecture; Software systems; Internet; workflow; microservice; execution traces; statistics; ANOMALY DETECTION; ONLINE;
DOI
10.1109/TNSM.2020.3022028
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Microservice architectures offer many benefits, such as faster delivery, improved scalability, and greater autonomy, and are therefore widely adopted for developing and operating Internet-based applications. Effectively diagnosing faults in applications composed of many dynamic microservices has become key to guaranteeing application performance and reliability. Because a microservice behaves differently in the different workflows that process requests, existing approaches often cannot accurately locate the root cause of a fault in an application with interacting microservices in a dynamic deployment environment. We propose a workflow-aware, statistics-based automatic fault diagnosis approach for microservice-based applications. We characterize traces across microservices as calling trees and learn trace patterns as baselines. For faults that affect request-processing workflows, we estimate each workflow's anomaly degree and then locate the microservices causing the anomalies by using tree edit distance to compare current traces with the learned baselines. For performance anomalies that significantly increase response time, we employ principal component analysis to extract suspicious microservices whose response times fluctuate widely. Finally, we evaluate our approach on three typical microservice-based applications in a series of experiments. The results show that our approach can accurately locate the microservices causing anomalies.
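The abstract's idea of comparing a current trace with a learned baseline by tree edit distance can be illustrated with a minimal sketch. This is not the authors' implementation: the tuple-based tree encoding, the unit edit costs, the function names, and the service names (`gateway`, `orders`, `inventory`, `payment`) are all assumptions made for illustration, and the plain memoized recursion below is only practical for small trees.

```python
from functools import lru_cache

# Assumed encoding: a calling tree node is (service_name, children),
# where children is a tuple of nodes in call order.


def size(forest):
    """Count the nodes in a forest (a tuple of trees)."""
    return sum(1 + size(children) for _, children in forest)


@lru_cache(maxsize=None)
def forest_distance(f1, f2):
    """Edit distance between two ordered forests with unit costs."""
    if not f1:
        return size(f2)  # insert every remaining node of f2
    if not f2:
        return size(f1)  # delete every remaining node of f1
    (label1, kids1), (label2, kids2) = f1[-1], f2[-1]
    # Delete the root of the rightmost tree in f1; its children move up.
    delete = forest_distance(f1[:-1] + kids1, f2) + 1
    # Insert the root of the rightmost tree in f2.
    insert = forest_distance(f1, f2[:-1] + kids2) + 1
    # Match the two rightmost roots, relabeling if the names differ.
    match = (forest_distance(kids1, kids2)
             + forest_distance(f1[:-1], f2[:-1])
             + (label1 != label2))
    return min(delete, insert, match)


def tree_edit_distance(t1, t2):
    """Edit distance between two calling trees."""
    return forest_distance((t1,), (t2,))


# Hypothetical baseline trace: gateway -> orders -> (inventory, payment).
baseline = ("gateway", (("orders", (("inventory", ()), ("payment", ()))),))
# Hypothetical current trace in which the call to payment never happened.
current = ("gateway", (("orders", (("inventory", ()),)),))

print(tree_edit_distance(baseline, baseline))  # 0
print(tree_edit_distance(baseline, current))   # 1
```

A distance of zero means the current trace matches the baseline pattern; a nonzero distance indicates a structural deviation in the workflow, and the edits that produce it point at the calls that differ.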
Pages: 2350-2363
Page count: 14