A Review of Unstructured Data Analysis and Parsing Methods

Authors
Jain, Shubham [1]
de Buitleir, Amy [2]
Fallon, Enda [1]
Affiliations
[1] Athlone Inst Technol, Software Res Inst, Athlone, Ireland
[2] Ericsson, Network Management Lab, Athlone, Ireland
Keywords
Data Mining; Information Extraction; Similarity; NLP; Knowledge base
DOI
10.1109/esci48226.2020.9167588
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Computer applications generate an enormous amount of data every day through their logs, system-generated files, and other reports. This data reflects the state of the running system and contains abundant information that can be used for system diagnostics and monitoring. Network monitoring systems produce a wide variety of unstructured information, so there is a need for an automated way to extract the relevant data, a task that currently requires a multitude of custom parsers. Developing and testing custom parsers can be time-consuming. Instead, data can be automatically processed and parsed into a machine-readable format, building a generic model for standard or vendor-specific data and generating insights for analytics, anomaly detection, intrusion detection, node failures, and various other applications. This paper reviews some existing approaches for unstructured data mining and parsing, discusses the challenges in information extraction and the creation of knowledge bases, and presents a generic framework for automatic parsing.
Pages: 164-169
Page count: 6