Unveiling Clusters of Events for Alert and Incident Management in Large-Scale Enterprise IT

Cited by: 22
Authors
Lin, Derek [1]
Raghu, Rashmi [1]
Ramamurthy, Vivek [1]
Yu, Jin [2]
Radhakrishnan, Regunathan [1]
Fernandez, Joseph [3]
Affiliations
[1] Pivotal Software Inc, 3495 Deer Creek Rd, Palo Alto, CA 94304 USA
[2] Pivotal Software Inc, Melbourne, Vic, Australia
[3] Visa Inc, Foster City, CA USA
Keywords
Hierarchical clustering; Connected Components; Graph cut; Complete Linkage; kd-tree; Non-Negative Matrix Factorization; Tickets Analysis; Alerts and Incidents management; PARTS;
DOI
10.1145/2623330.2623360
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Large enterprise IT (Information Technology) infrastructure components generate large volumes of alerts and incident tickets. These are manually screened, but it is otherwise difficult to extract information automatically from them to gain insights in order to improve operational efficiency. We propose a framework to cluster alerts and incident tickets based on the text in them, using unsupervised machine learning. This would be a step towards eliminating manual classification of the alerts and incidents, which is very labor-intensive and costly. Our framework can handle the semi-structured text in alerts generated by IT infrastructure components such as storage devices, network devices, and servers, as well as the unstructured text in incident tickets created manually by operations support personnel. After text pre-processing and application of appropriate distance metrics, we apply different graph-theoretic approaches to cluster the alerts and incident tickets, based on their semi-structured and unstructured text respectively. For automated interpretation and readability of semi-structured text clusters, we propose a method to visualize clusters that preserves the structure and human-readability of the text data, in contrast to traditional word clouds, where the text structure is not preserved; for unstructured text clusters, we find a simple way to define prototypes of clusters for easy interpretation. This framework for clustering and visualization will enable enterprises to prioritize the issues in their IT infrastructure and improve the reliability and availability of their services.
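As a rough illustration of the pipeline outlined in the abstract (pre-process text, apply a distance metric, then cluster with a graph-theoretic method), the sketch below groups alert strings by taking connected components of a thresholded distance graph. It is not the authors' implementation: the tokenizer, the choice of Jaccard distance, the `max_distance` threshold, and all function names are assumptions made for this example only.

```python
import re
from itertools import combinations


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two token sets (0.0 if both are empty)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_by_connected_components(texts, max_distance=0.5):
    """Link texts whose distance is at most max_distance (an assumed threshold)
    and return the connected components as sorted lists of indices."""
    token_sets = [tokenize(t) for t in texts]
    parent = list(range(len(texts)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Add an edge (merge two components) for every sufficiently similar pair.
    for i, j in combinations(range(len(texts)), 2):
        if jaccard_distance(token_sets[i], token_sets[j]) <= max_distance:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(texts)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())
```

Called on a handful of semi-structured alert strings that differ only in a host name or port number, this groups the near-duplicates together. The pairwise comparison is quadratic in the number of texts, so a sketch like this only suits small samples.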
Pages: 1630-1639
Page count: 10