Scientific data management in a Grid environment

Cited by: 0
Authors
James H.A. [1]
Hawick K.A. [1]
Affiliation
[1] Institute of Information and Mathematical Sciences, Massey University, Auckland
Keywords
Data management; Data mining; Grid systems; Metadata; Parameter cross-products
DOI
10.1007/s10723-005-5464-y
Abstract
Managing scientific data is by no means a trivial task, even in a single-site environment with a small number of researchers involved. We discuss some issues concerned with posing well-specified experiments in terms of parameters or instrument settings, and the metadata framework that arises from doing so. We are particularly interested in parallel computer simulation experiments run in a multi-site Grid environment, where very large quantities of warehouse-able data are involved. We consider SQL databases and other framework technologies for manipulating experimental data. Our framework manages the outputs from parallel runs that arise from large cross-products of parameter combinations. Considerable useful experiment planning and analysis can be done with the sparse metadata without fully expanding the parameter cross-products. Extra value can be obtained from simulation output that can subsequently be data-mined. We have particular interests in running large-scale Monte Carlo physics model simulations. Finding ourselves overwhelmed by the problems of managing data and compute resources, we have built a prototype tool using Java and MySQL that addresses these issues. We use this example to discuss type-space management and other fundamental ideas for implementing a laboratory information management system. © Springer 2005.
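The idea of planning against sparse metadata without expanding the parameter cross-product can be illustrated with a minimal sketch. It is written in Python rather than the Java and MySQL of the authors' prototype, and it is not their implementation. The axis names, the values, and the helpers `count_runs` and `run_at` are hypothetical, chosen only for illustration. The sketch stores one short list per parameter axis, counts the runs from the axis lengths alone, and decodes a single run index into its parameter combination on demand.

```python
from math import prod

# Hypothetical parameter axes for a Monte Carlo study (illustrative values,
# not taken from the paper). Only these short lists are stored.
axes = {
    "lattice_size": [32, 64, 128],
    "temperature": [1.5, 2.0, 2.5, 3.0],
    "seed": [1, 2],
}


def count_runs(axes):
    """Return the size of the cross-product using axis lengths only."""
    return prod(len(values) for values in axes.values())


def run_at(axes, index):
    """Decode one run index into a parameter combination.

    Uses mixed-radix decoding, with the last axis varying fastest,
    so no other combination has to be generated.
    """
    if not 0 <= index < count_runs(axes):
        raise IndexError(index)
    combo = {}
    for name in reversed(list(axes)):
        values = axes[name]
        index, offset = divmod(index, len(values))
        combo[name] = values[offset]
    # Return the parameters in the original axis order.
    return {name: combo[name] for name in axes}


total = count_runs(axes)   # number of runs the experiment would require
tenth = run_at(axes, 10)   # parameters of the run with index 10
```

With this layout, a scheduler could hand out run indices to Grid nodes and record only the index alongside each output, since the parameters can be recovered from the axis lists whenever they are needed.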
Pages: 39-51
Number of pages: 12
Related papers
  • [1] Data discovery algorithm for scientific data grid environment
    Abdullah, A
    Othman, M
    Sulaiman, MN
    Ibrahim, H
    Othman, AT
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2005, 65 (11) : 1429 - 1434
  • [2] A grid environment for data integration of scientific databases
    Matsuda, H
    First International Conference on e-Science and Grid Computing, Proceedings, 2005, : 3 - 4
  • [3] A GRelC based data grid management environment
    Fiore, S.
    Mirto, M.
    Cafaro, M.
    Vadacca, S.
    Negro, A.
    Aloisio, G.
    PROCEEDINGS OF THE 21ST IEEE INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS, 2008, : 355 - +
  • [4] Autonomic data management system in grid environment
    Thi-Mai-Huong Nguyen
    Magoules, Frederic
    JOURNAL OF ALGORITHMS & COMPUTATIONAL TECHNOLOGY, 2009, 3 (01) : 155 - 177
  • [5] Network storage management in data grid environment
    Yang, SF
    Ali, Z
    Kettani, H
    Verma, V
    Malluhi, Q
    GRID AND COOPERATIVE COMPUTING, PT 2, 2004, 3033 : 879 - 886
  • [6] Scientific data management architecture for grid computing environments
    No, J
    Cuong, NT
    Park, SS
    GRID AND COOPERATIVE COMPUTING - GCC 2005, PROCEEDINGS, 2005, 3795 : 541 - 546
  • [7] SCE: Grid Environment for Scientific Computing
    Xiao, Haili
    Wu, Hong
    Chi, Xuebin
    NETWORKS FOR GRID APPLICATIONS, 2009, 2 : 35 - 42
  • [8] Data discovery mechanism for a large peer-to-peer based scientific data grid environment
    Abdullah, A
    Othman, M
    Sulaiman, MN
    Ibrahim, H
    Othman, AT
    COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2004, PT 2, 2004, 3044 : 146 - 157
  • [9] Scientific Data Sharing Using Clustered-based Data Sharing (CDS) in Grid Environment
    Latip, Rohaya
    Ibrahim, Hamidah
    Al-Hanandeh, Feras Ahmad
    PROCEEDINGS OF KNOWLEDGE MANAGEMENT 5TH INTERNATIONAL CONFERENCE 2010, 2010, : 579 - 582
  • [10] A split&merge data management architecture for a grid environment
    Aloisio, Giovanni
    Cafaro, Massimo
    Fiore, Sandro
    Mirto, Maria
    19TH IEEE INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS, PROCEEDINGS, 2006, : 739 - +