DROP Computing: Data Driven Pipeline Processing for the SKA

Authors
Wicenec, Andreas [1]
Pallot, Dave [1]
Tobar, Rodrigo [1]
Wu, Chen [1]
Affiliations
[1] Univ Western Australia, ICRAR, Perth, WA, Australia
Keywords
DOI
Not available
Chinese Library Classification
P1 [Astronomy];
Discipline classification code
0704;
Abstract
The correlator output of the SKA arrays will be of the order of 1 TB/s. That data stream will have to be processed by the Science Data Processor using dedicated HPC infrastructure in both Australia and South Africa. Radio astronomical processing is, in principle, thought to be highly data parallel, with little to no communication required between individual tasks. Together with the ever-increasing number of cores (CPUs) and stream processors (GPUs), this led us to step back and reconsider the traditional pipeline and task-driven approach on a more fundamental level. We have thus started to look into dataflow representations (Dennis & Misunas 1974) and dataflow programming models (Davis 1978), as well as dataflow languages (Johnston et al. 2004) and scheduling (Benoit et al. 2014). We have investigated a number of existing systems and prototyped some implementations using simplified but real radio astronomy workflows. Although many of these approaches already focus on data and dataflow as the most critical component, we still missed a rigorously data-driven approach, in which the data itself essentially drives the whole process. In this talk we will present the new concept of DROP Computing (condensed data cloud), which is an integral part of the current SKA Data Layer architecture. In short, a DROP is an abstract class, instances of which represent data (DataDrop), collections of DROPs (ContainerDrop), and also applications (ApplicationDrop, e.g. pipeline components). The remaining details will be presented in the talk.
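The class hierarchy named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: only the class names Drop, DataDrop, ContainerDrop and ApplicationDrop come from the abstract, while every method and attribute (`add_consumer`, `write`, `set_completed`, `input_completed`, `payload`) and the `demo` function are hypothetical names chosen for illustration. The sketch assumes that "data driven" means an application runs as soon as all of its input data is complete, with no central scheduler calling it.

```python
from abc import ABC, abstractmethod


class Drop(ABC):
    """Abstract base class: data, containers and applications are all Drops."""

    def __init__(self, uid):
        self.uid = uid
        self.completed = False
        self.consumers = []  # hypothetical: applications waiting on this Drop

    def add_consumer(self, app):
        # Register an application that needs this Drop as an input.
        self.consumers.append(app)
        app.inputs.append(self)

    def set_completed(self):
        # Completion of the data is the event that drives execution.
        self.completed = True
        for app in self.consumers:
            app.input_completed()

    @abstractmethod
    def describe(self):
        ...


class DataDrop(Drop):
    """Represents a piece of data (here just an in-memory payload)."""

    def __init__(self, uid, payload=None):
        super().__init__(uid)
        self.payload = payload

    def write(self, payload):
        self.payload = payload
        self.set_completed()

    def describe(self):
        return f"data:{self.uid}"


class ContainerDrop(Drop):
    """Represents a collection of other Drops."""

    def __init__(self, uid, children):
        super().__init__(uid)
        self.children = list(children)

    def describe(self):
        inner = ", ".join(child.describe() for child in self.children)
        return f"container:{self.uid}[{inner}]"


class ApplicationDrop(Drop):
    """Represents an application, e.g. one pipeline component."""

    def __init__(self, uid, func, output):
        super().__init__(uid)
        self.func = func
        self.output = output
        self.inputs = []

    def input_completed(self):
        # Run only once, and only when every input Drop is complete.
        if not self.completed and all(d.completed for d in self.inputs):
            result = self.func(*[d.payload for d in self.inputs])
            self.completed = True
            self.output.write(result)

    def describe(self):
        return f"app:{self.uid}"


def demo():
    """Two inputs feed an adder; its output feeds a doubler."""
    raw_a, raw_b = DataDrop("raw_a"), DataDrop("raw_b")
    total, doubled = DataDrop("total"), DataDrop("doubled")
    adder = ApplicationDrop("adder", lambda a, b: a + b, total)
    doubler = ApplicationDrop("doubler", lambda x: 2 * x, doubled)
    raw_a.add_consumer(adder)
    raw_b.add_consumer(adder)
    total.add_consumer(doubler)
    raw_a.write(1)  # adder does not run yet: raw_b is incomplete
    raw_b.write(2)  # adder runs, then doubler runs as a consequence
    return doubled.payload
```

In this sketch nothing invokes `adder` or `doubler` directly; writing the last missing input is what triggers the chain, which is the sense of "the data itself drives the process" assumed here.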
Pages: 319-328
Page count: 10