Exploring the Adaptability of Word Embeddings to Log Message Classification

Authors
Shehu, Yusufu [1]
Harper, Robert [1]
Affiliations
[1] Moogsoft Ltd, 31-35 High St, Kingston Upon Thames, Surrey, England
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Minimizing the resolution time of service-impacting incidents is a fundamental objective of IT operations. Enriching the metadata of the events and logs ingested by IT operations systems using AI-based classifiers greatly increases the efficacy of features such as root cause analysis and workflow automation, and hence reduces incident remediation time. The use of word embeddings in text classification tasks is well established; however, the general English corpora used to generate off-the-shelf embeddings lack the domain-specific lexicon required for accurate classification of event and log data. In the current contribution, we investigate multiple ways in which this deficiency can be addressed. In addition to augmenting the training corpus with a domain-specific lexicon, we increase the granularity of our embedding using character n-gram decompositions and sub-word level representations. All implementations improved classification accuracy over the base case. Further, we explore the performance of a sequence classifier with embeddings of varying domain specificity. We observe that the performance of high-specificity models degrades as the volume of previously unseen words in the test data increases. We conclude that for a multi-input use case, and by leveraging sub-word level information, a high-specificity model can be outperformed by a model trained on a low-specificity corpus.
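To make the idea of character n-gram decomposition concrete, the following is a minimal sketch and not the authors' implementation. It assumes a fastText-style scheme (boundary markers `<` and `>`, n-grams of length 3 to 5), which the abstract does not specify. The function names, the vector dimension, and the hash-derived vectors are all hypothetical stand-ins for learned n-gram embeddings. The point it illustrates is that a previously unseen log token can still receive a representation, because it shares n-grams with tokens seen during training.

```python
import hashlib
from typing import List


def char_ngrams(token: str, n_min: int = 3, n_max: int = 5) -> List[str]:
    """Split a token into character n-grams, using < and > as boundary markers."""
    wrapped = f"<{token}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


def ngram_vector(gram: str, dim: int = 16) -> List[float]:
    """Hypothetical stand-in for a learned n-gram embedding.

    A real model would look the n-gram up in a trained table; here a
    deterministic vector is derived from a hash so the sketch runs on its own.
    """
    digest = hashlib.sha256(gram.encode("utf-8")).digest()
    return [digest[i % len(digest)] / 255.0 - 0.5 for i in range(dim)]


def token_vector(token: str, dim: int = 16) -> List[float]:
    """Represent a token as the average of its n-gram vectors."""
    grams = char_ngrams(token)
    if not grams:
        return [0.0] * dim
    total = [0.0] * dim
    for gram in grams:
        for i, value in enumerate(ngram_vector(gram, dim)):
            total[i] += value
    return [value / len(grams) for value in total]


def shared_ngram_ratio(a: str, b: str) -> float:
    """Jaccard overlap of the n-gram sets of two tokens."""
    grams_a, grams_b = set(char_ngrams(a)), set(char_ngrams(b))
    union = grams_a | grams_b
    if not union:
        return 0.0
    return len(grams_a & grams_b) / len(union)
```

For example, `char_ngrams("db", 3, 3)` yields `["<db", "db>"]`, and `shared_ngram_ratio("timeout", "timeouts")` is non-zero, so an unseen variant such as `timeouts` would inherit part of its representation from `timeout`. A word-level lookup table would have no entry for it at all.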
Pages: 854-859
Page count: 6