On the provenance extraction techniques from large scale log files

Cited by: 2
Authors
Tufek, Alper [1]
Aktas, Mehmet S. [1]
Affiliation
[1] Yildiz Tech Univ, Comp Engn Dept, Istanbul, Turkey
Keywords
machine learning-based provenance extraction; numerical weather prediction models; provenance; provenance analysis
DOI
10.1002/cpe.6559
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Numerical weather prediction (NWP) models are the primary instruments for forecasting weather. Provenance information is central to detecting unexpected events that may arise during a long model run. The need to share scientific data and results among researchers further underscores the importance of data quality and reliability. The Weather Research and Forecasting (WRF) model is an open-source NWP model. In this study, we propose a methodology for tracking the WRF model and for generating, storing, and analyzing its provenance. We implement the methodology with a machine learning-based parser that uses classification algorithms to extract provenance information. The approach produces provenance graphs that make numerical weather forecast workflows easier to manage and understand. By analyzing these graphs, faults that occur during WRF execution can be traced to their root causes. The approach has been evaluated and shown to perform well even under a high-frequency flow of provenance information.
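The abstract does not state which classifiers, features, or provenance vocabulary the parser uses, so the following is only a minimal sketch of the general idea: classify each log line into a provenance category, then turn the classified lines into graph edges. Everything in it is an assumption made for illustration: the log lines are invented and are not WRF output, the four labels are hypothetical, a small naive Bayes classifier stands in for whatever algorithms the authors evaluated, and the edge names `used` and `wasGeneratedBy` are borrowed from the W3C PROV data model.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical training set: invented log lines with invented labels.
# These are NOT real WRF log lines; they only illustrate the technique.
TRAIN = [
    ("starting task preprocess", "activity_start"),
    ("starting task simulate", "activity_start"),
    ("task preprocess finished", "activity_end"),
    ("task simulate finished", "activity_end"),
    ("opened input file grid.dat for reading", "used"),
    ("reading input file boundary.dat", "used"),
    ("wrote output file forecast.out", "generated"),
    ("writing output file restart.out", "generated"),
]


def tokenize(line):
    """Split a log line into lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", line.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (a stand-in classifier)."""

    def fit(self, samples):
        self.class_counts = Counter()
        self.token_counts = defaultdict(Counter)
        self.vocab = set()
        for line, label in samples:
            self.class_counts[label] += 1
            for tok in tokenize(line):
                self.token_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, line):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # log prior
            denom = sum(self.token_counts[label].values()) + len(self.vocab)
            for tok in tokenize(line):
                if tok in self.vocab:  # tokens never seen in training are skipped
                    score += math.log((self.token_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def extract_edges(lines, model):
    """Convert classified log lines into (source, relation, target) edges."""
    edges, current = [], None
    for line in lines:
        label = model.predict(line)
        if label == "activity_start":
            match = re.search(r"task (\w+)", line)
            current = match.group(1) if match else None
        elif label == "activity_end":
            current = None
        elif label in ("used", "generated") and current:
            name = re.search(r"[\w-]+\.\w+", line)  # first file-like token
            if name and label == "used":
                edges.append((current, "used", name.group(0)))
            elif name:
                edges.append((name.group(0), "wasGeneratedBy", current))
    return edges


model = NaiveBayes().fit(TRAIN)
new_log = [
    "starting task postprocess",
    "reading input file terrain.dat",
    "wrote output file result.out",
    "task postprocess finished",
]
edges = extract_edges(new_log, model)
for edge in edges:
    print(edge)
```

Keeping classification separate from edge construction means the classifier can be swapped out without touching the graph-building step, which is one plausible way to compare several classification algorithms on the same logs.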
Pages: 12