DROP Computing: Data Driven Pipeline Processing for the SKA

Cited by: 0
Authors
Wicenec, Andreas [1]
Pallot, Dave [1]
Tobar, Rodrigo [1]
Wu, Chen [1]
Affiliations
[1] Univ Western Australia, ICRAR, Perth, WA, Australia
DOI
Not available
Chinese Library Classification
P1 [Astronomy]
Discipline classification code
0704
Abstract
The correlator output of the SKA arrays will be of the order of 1 TB/s. That data rate will have to be processed by the Science Data Processor using dedicated HPC infrastructure in both Australia and South Africa. Radio astronomical processing is, in principle, thought to be highly data-parallel, with little to no communication required between individual tasks. Together with the ever-increasing number of cores (CPUs) and stream processors (GPUs), this led us to step back and think about the traditional pipeline- and task-driven approach on a more fundamental level. We have thus started to look into dataflow representations (Dennis & Misunas 1974) and dataflow programming models (Davis 1978), as well as dataflow languages (Johnston et al. 2004) and scheduling (Benoit et al. 2014). We have investigated a number of existing systems and prototyped some implementations using simplified but real radio astronomy workflows. Although many of these approaches already focus on data and dataflow as the most critical component, we still missed a rigorously data-driven approach, in which the data itself essentially drives the whole process. In this talk we will present the new concept of DROP Computing (condensed data cloud), which is an integral part of the current SKA Data Layer architecture. In short, a DROP is an abstract class, instances of which represent data (DataDrop), collections of DROPs (Container Drop), but also applications (ApplicationDrop, e.g. pipeline components). The rest are just details, which will be presented in the talk.
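The class hierarchy named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: every method name (`subscribe`, `notify`, `write`), the event mechanism, and the spelling `ContainerDrop` are assumptions made for illustration. It only shows the general idea that completing a piece of data, rather than a central scheduler, triggers the next processing step.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, List


class Drop(ABC):
    """Hypothetical base class: every node in the graph is a Drop."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.completed = False
        self._listeners: List["Drop"] = []

    def subscribe(self, listener: "Drop") -> None:
        """Register another Drop to be told when this one completes."""
        self._listeners.append(listener)

    def _complete(self) -> None:
        # Completion is the event that drives everything downstream.
        self.completed = True
        for listener in self._listeners:
            listener.notify(self)

    @abstractmethod
    def notify(self, source: "Drop") -> None:
        """React to the completion of a Drop this one subscribed to."""


class DataDrop(Drop):
    """Represents a single piece of data."""

    def __init__(self, name: str) -> None:
        super().__init__(name)
        self.value: Any = None

    def write(self, value: Any) -> None:
        self.value = value
        self._complete()

    def notify(self, source: Drop) -> None:
        pass  # plain data does not react to upstream events


class ContainerDrop(Drop):
    """Represents a collection of Drops; complete once all children are."""

    def __init__(self, name: str, children: List[DataDrop]) -> None:
        super().__init__(name)
        self.children = list(children)
        for child in self.children:
            child.subscribe(self)

    @property
    def value(self) -> List[Any]:
        return [child.value for child in self.children]

    def notify(self, source: Drop) -> None:
        if not self.completed and all(c.completed for c in self.children):
            self._complete()


class ApplicationDrop(Drop):
    """Represents a pipeline component; runs when its input completes."""

    def __init__(self, name: str, func: Callable[[Any], Any],
                 inp: Drop, out: DataDrop) -> None:
        super().__init__(name)
        self.func, self.inp, self.out = func, inp, out
        inp.subscribe(self)

    def notify(self, source: Drop) -> None:
        if self.inp.completed and not self.completed:
            self.out.write(self.func(self.inp.value))
            self._complete()


# Usage: three data chunks feed a summing application via a container.
chunks = [DataDrop(f"chunk{i}") for i in range(3)]
group = ContainerDrop("all_chunks", chunks)
total = DataDrop("total")
app = ApplicationDrop("sum", sum, group, total)

for i, chunk in enumerate(chunks):
    chunk.write(i + 1)  # writing the last chunk triggers the application

print(total.value)
```

In this sketch nothing calls the application explicitly: writing the final chunk completes the container, which in turn notifies the application. A real system would also need error handling, distributed execution, and data lifecycle management, none of which is modelled here.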
Pages: 319-328
Page count: 10