Data Grid tools: enabling science on big distributed data

Cited by: 4
Authors
Allcock, B [1]
Chervenak, A [1]
Foster, I [1]
Kesselman, C [1]
Livny, M [1]
Affiliation
[1] Argonne Natl Lab, Argonne, IL 60439 USA
DOI
10.1088/1742-6596/16/1/079
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
A particularly demanding and important challenge that we face as we attempt to construct the distributed computing machinery required to support SciDAC goals is the efficient, high-performance, reliable, secure, and policy-aware management of large-scale data movement. This problem is fundamental to diverse application domains including experimental physics (high energy physics, nuclear physics, light sources), simulation science (climate, computational chemistry, fusion, astrophysics), and large-scale collaboration. In each case, highly distributed user communities require high-speed access to valuable data, whether for visualization or analysis. The quantities of data involved (terabytes to petabytes), the scale of the demand (hundreds or thousands of users, data-intensive analyses, real-time constraints), and the complexity of the infrastructure that must be managed (networks, tertiary storage systems, network caches, computers, visualization systems) make the problem extremely challenging. Data management tools developed under the auspices of the SciDAC Data Grid Middleware project have become the de facto standard for data management in projects worldwide. Day in and day out, these tools provide the "plumbing" that allows scientists to do more science on an unprecedented scale in production environments.
Pages: 571-575
Number of pages: 5