On the provenance extraction techniques from large scale log files

Cited by: 2
Authors
Tufek, Alper [1]
Aktas, Mehmet S. [1]
Affiliation
[1] Yildiz Tech Univ, Comp Engn Dept, Istanbul, Turkey
Keywords
machine learning-based provenance extraction; numerical weather prediction models; provenance; provenance analysis
DOI
10.1002/cpe.6559
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Numerical weather prediction (NWP) models are the most important instruments for predicting future weather. Provenance information is of central importance for detecting unexpected events that may develop during the long course of model execution. In addition, the need to share scientific data and results among researchers highlights the importance of data quality and reliability. The Weather Research and Forecasting (WRF) model is an open-source NWP model. In this study, we propose a methodology for tracking the WRF model and for generating, storing, and analyzing provenance. We implement the proposed methodology with a machine learning-based parser that uses classification algorithms to extract provenance information. By providing provenance graphs, the proposed approach makes numerical weather forecast workflows easier to manage and understand. By analyzing these graphs, potential faulty situations that may occur during the execution of WRF can be traced to their root causes. The proposed approach has been evaluated and shown to perform well even under a high-frequency flow of provenance information.
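The abstract does not say which classifier, features, or provenance schema the authors use, so the following is only a minimal sketch of the general idea: classify each log line into a provenance category, then link classified lines into graph edges. It assumes a small multinomial naive Bayes model over whitespace tokens as a stand-in for the paper's classification algorithms. The log lines, the label set (`activity`, `entity`, `other`), and the names `NaiveBayes` and `provenance_edges` are all hypothetical and are not taken from the paper or from real WRF output.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training data: (log line, provenance label). Both the lines and
# the labels are illustrative assumptions, not real WRF log output.
TRAIN = [
    ("starting process real on domain 1", "activity"),
    ("starting process wrf on domain 1", "activity"),
    ("finished process wrf on domain 1", "activity"),
    ("writing output file wrfout_d01", "entity"),
    ("reading input file wrfinput_d01", "entity"),
    ("writing restart file wrfrst_d01", "entity"),
    ("memory usage 512 mb", "other"),
    ("elapsed seconds 3.2", "other"),
]


def tokenize(line):
    """Lowercase the line and split it on whitespace."""
    return line.lower().split()


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, samples):
        self.class_counts = Counter()
        self.token_counts = defaultdict(Counter)
        self.vocab = set()
        for line, label in samples:
            self.class_counts[label] += 1
            for tok in tokenize(line):
                self.token_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, line):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods per token.
            score = math.log(n / total)
            denom = sum(self.token_counts[label].values()) + len(self.vocab)
            for tok in tokenize(line):
                score += math.log((self.token_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def provenance_edges(lines, model):
    """Link each entity line to the most recent activity line seen before it."""
    edges, current_activity = [], None
    for line in lines:
        label = model.predict(line)
        if label == "activity":
            current_activity = line
        elif label == "entity" and current_activity is not None:
            edges.append((current_activity, line))
    return edges


model = NaiveBayes().fit(TRAIN)
log = [
    "starting process wrf on domain 2",
    "writing output file wrfout_d02",
    "memory usage 640 mb",
]
edges = provenance_edges(log, model)
```

Here `edges` holds (activity, entity) pairs that could be loaded into a graph store for root-cause analysis. A real parser would need labeled WRF log data and a richer relation model than this single "most recent activity" heuristic.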
Pages: 12