Metadata management for scientific databases

Cited by: 10
Authors
Pinoli, Pietro [1]
Ceri, Stefano [1]
Martinenghi, Davide [1]
Nanni, Luca [1]
Affiliations
[1] Politecnico di Milano, Milan, Italy
Funding
European Research Council
Keywords
Metadata management; Scientific databases; Query optimization; Provenance
DOI
10.1016/j.is.2018.10.002
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Most scientific databases consist of datasets (or sources) which in turn include samples (or files) with an identical structure (or schema). In many cases, samples are associated with rich metadata describing the process that leads to building them (e.g., the experimental conditions used during sample generation). Metadata are typically used in scientific computations just for the initial data selection; at most, metadata about query results are recovered after executing the query and associated with its results by post-processing. In this way, a large body of information that could be relevant for interpreting query results goes unused during query processing. In this paper, we present ScQL, a new algebraic relational language whose operations apply to objects consisting of data-metadata pairs, preserving this one-to-one correspondence throughout the computation. We formally define each operation and we describe an optimization, called meta-first, that may significantly reduce the query processing overhead by anticipating the use of metadata for selectively loading into the execution environment only those input samples that contribute to the result samples. In ScQL, metadata have the same relevance as data and contribute to building query results; in this way, the resulting samples are systematically associated with metadata about either the specific input samples involved or about query processing, thereby yielding a new form of metadata provenance. We present many examples of the use of ScQL, relative to several application domains, and we demonstrate the effectiveness of the meta-first optimization. © 2018 The Authors. Published by Elsevier Ltd.
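The idea behind the meta-first optimization can be illustrated with a small Python sketch. This is not ScQL and does not reproduce the paper's operators or syntax, which the abstract does not show; every name here (Sample, LazySource, select_meta_first, the derived_from key) is a hypothetical introduced for illustration. It assumes metadata are cheap to read while sample data are costly to load, and it shows a selection that evaluates the metadata predicate first, loads only the qualifying samples, and keeps each result paired with metadata that records its origin.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Metadata = Dict[str, str]
Region = Dict[str, float]


@dataclass
class Sample:
    """A data-metadata pair; the two parts stay together through every operation."""
    sample_id: str
    meta: Metadata
    data: List[Region]


class LazySource:
    """Hypothetical dataset: metadata are held in memory, data are loaded on demand."""

    def __init__(self, metas: Dict[str, Metadata],
                 loader: Callable[[str], List[Region]]):
        self.metas = metas
        self.loader = loader
        self.loaded: List[str] = []  # ids of samples whose data were actually loaded

    def load(self, sample_id: str) -> Sample:
        self.loaded.append(sample_id)
        return Sample(sample_id, self.metas[sample_id], self.loader(sample_id))


def select_meta_first(source: LazySource,
                      meta_pred: Callable[[Metadata], bool],
                      data_pred: Callable[[Region], bool]) -> List[Sample]:
    """Filter on metadata before touching data, then filter the loaded data."""
    result = []
    for sid, meta in source.metas.items():
        if not meta_pred(meta):
            continue  # sample cannot contribute to the result: never loaded
        sample = source.load(sid)
        kept = [r for r in sample.data if data_pred(r)]
        out_meta = dict(sample.meta)      # copy, so the input metadata stay unchanged
        out_meta["derived_from"] = sid    # illustrative provenance annotation
        result.append(Sample(sid, out_meta, kept))
    return result


# Toy inputs (invented for this sketch).
metas = {
    "s1": {"tissue": "liver"},
    "s2": {"tissue": "brain"},
    "s3": {"tissue": "liver"},
}
raw = {
    "s1": [{"score": 0.9}, {"score": 0.2}],
    "s2": [{"score": 0.8}],
    "s3": [{"score": 0.1}],
}
source = LazySource(metas, lambda sid: raw[sid])
result = select_meta_first(
    source,
    meta_pred=lambda m: m["tissue"] == "liver",
    data_pred=lambda r: r["score"] > 0.5,
)
```

In this toy run the data for s2 are never loaded, because its metadata already rule it out. Whether the actual saving is significant depends on how selective the metadata predicate is and on the relative cost of loading data, which the sketch does not model.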
Pages: 1-20
Number of pages: 20