Data Grid tools: enabling science on big distributed data

Cited by: 4
Authors
Allcock, B [1]
Chervenak, A [1]
Foster, I [1]
Kesselman, C [1]
Livny, M [1]
Affiliations
[1] Argonne Natl Lab, Argonne, IL 60439 USA
DOI
10.1088/1742-6596/16/1/079
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
A particularly demanding and important challenge that we face as we attempt to construct the distributed computing machinery required to support SciDAC goals is the efficient, high-performance, reliable, secure, and policy-aware management of large-scale data movement. This problem is fundamental to diverse application domains including experimental physics (high energy physics, nuclear physics, light sources), simulation science (climate, computational chemistry, fusion, astrophysics), and large-scale collaboration. In each case, highly distributed user communities require high-speed access to valuable data, whether for visualization or analysis. The quantities of data involved (terabytes to petabytes), the scale of the demand (hundreds or thousands of users, data-intensive analyses, real-time constraints), and the complexity of the infrastructure that must be managed (networks, tertiary storage systems, network caches, computers, visualization systems) make the problem extremely challenging. Data management tools developed under the auspices of the SciDAC Data Grid Middleware project have become the de facto standard for data management in projects worldwide. Day in and day out, these tools provide the "plumbing" that allows scientists to do more science on an unprecedented scale in production environments.
Pages: 571-575
Number of pages: 5