Metadata management for scientific databases

Cited by: 10
Authors
Pinoli, Pietro [1]
Ceri, Stefano [1]
Martinenghi, Davide [1]
Nanni, Luca [1]
Affiliations
[1] Politecnico di Milano, Milan, Italy
Funding
European Research Council
Keywords
Metadata management; Scientific databases; Query optimization; Provenance
DOI
10.1016/j.is.2018.10.002
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Most scientific databases consist of datasets (or sources), which in turn include samples (or files) with an identical structure (or schema). In many cases, samples are associated with rich metadata describing the process that led to building them (e.g., the experimental conditions used during sample generation). Metadata are typically used in scientific computations only for the initial data selection; at most, metadata about query results are recovered after executing the query and associated with its results by post-processing. As a result, a large body of information that could be relevant for interpreting query results goes unused during query processing. In this paper, we present ScQL, a new algebraic relational language whose operations apply to objects consisting of data-metadata pairs and preserve this one-to-one correspondence throughout the computation. We formally define each operation and describe an optimization, called meta-first, that may significantly reduce query processing overhead by anticipating the use of metadata so that only those input samples that contribute to the result samples are loaded into the execution environment. In ScQL, metadata have the same relevance as data and contribute to building query results; the resulting samples are therefore systematically associated with metadata about either the specific input samples involved or the query processing itself, yielding a new form of metadata provenance. We present many examples of ScQL use across several application domains, and we demonstrate the effectiveness of the meta-first optimization. © 2018 The Authors. Published by Elsevier Ltd.
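The meta-first idea described in the abstract can be illustrated with a minimal Python sketch. This is not ScQL syntax and not the authors' implementation: the `Sample` class, the lazy `load` callable, and the `select_meta_first` function are hypothetical names introduced only for illustration. The sketch assumes each sample is a data-metadata pair whose data part is expensive to load, evaluates the metadata predicate first, and loads data only for the samples that pass it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Sample:
    """A data-metadata pair (hypothetical model, not the ScQL definition)."""
    meta: Dict[str, str]
    load: Callable[[], List[float]]  # lazy loader for the data part


def select_meta_first(
    samples: List[Sample],
    meta_pred: Callable[[Dict[str, str]], bool],
    data_pred: Callable[[float], bool],
) -> List[Tuple[Dict[str, str], List[float]]]:
    """Filter on metadata first; load data only for samples that qualify."""
    result = []
    for s in samples:
        if not meta_pred(s.meta):
            continue  # rejected on metadata alone: its data is never loaded
        rows = [r for r in s.load() if data_pred(r)]
        # The metadata travel with the data into the result sample.
        result.append((dict(s.meta), rows))
    return result


loaded: List[str] = []  # records which samples had their data loaded


def make_loader(name: str, rows: List[float]) -> Callable[[], List[float]]:
    """Build a loader that logs its own invocation (illustration only)."""
    def load() -> List[float]:
        loaded.append(name)
        return rows
    return load


samples = [
    Sample({"id": "s1", "tissue": "liver"}, make_loader("s1", [0.2, 0.9, 1.5])),
    Sample({"id": "s2", "tissue": "brain"}, make_loader("s2", [0.1, 2.0])),
]

result = select_meta_first(
    samples,
    meta_pred=lambda m: m["tissue"] == "liver",
    data_pred=lambda v: v > 0.5,
)
```

In this toy run, only `s1` satisfies the metadata predicate, so `loaded` contains just `"s1"` and the data of `s2` are never read. Each result keeps its metadata next to its data, which is the one-to-one correspondence the abstract refers to.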
Pages: 1-20
Page count: 20