Leveraging State-of-the-Art Engines for Large-Scale Data Analysis in High Energy Physics

Cited by: 0
Authors
Vincenzo Eduardo Padulano
Ivan Donchev Kabadzhov
Enric Tejedor Saavedra
Enrico Guiraud
Pedro Alonso-Jordá
Affiliations
[1] CERN, EP
[2] Universitat Politècnica de València, SFT
[3] Albert Ludwig University of Freiburg, Department of Computation Systems and Computation
Source
Journal of Grid Computing | 2023 / Vol. 21
Keywords
ROOT; High energy physics; Distributed computing; Dask; Spark
DOI
Not available
Abstract
The Large Hadron Collider (LHC) at CERN has generated a vast amount of information from physics events, reaching peaks of TB of data per day which are then sent to large storage facilities. Traditionally, data processing workflows in the High Energy Physics (HEP) field have leveraged grid computing resources. In this context, users have been responsible for manually parallelising the analysis, sending tasks to computing nodes and aggregating the partial results. Analysis environments in this field have had a common building block in the ROOT software framework. This is the de facto standard tool for storing, processing and visualising HEP data. ROOT offers a modern analysis tool called RDataFrame, which can parallelise computations from a single machine to a distributed cluster while hiding most of the scheduling and result aggregation complexity from users. This is currently done by leveraging Apache Spark as the distributed execution engine, but other alternatives are being explored by HEP research groups. Notably, Dask has rapidly gained popularity thanks to its ability to interface with batch queuing systems, widespread in HEP grid computing facilities. Furthermore, future upgrades of the LHC are expected to bring a dramatic increase in data volumes. This paper presents a novel implementation of the Dask backend for the distributed RDataFrame tool in order to address the aforementioned future trends. The scalability of the tool with both the new backend and the already available Spark backend is demonstrated for the first time on more than two thousand cores, testing a real HEP analysis.
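The manual workflow the abstract describes (split the dataset, send tasks to workers, aggregate the partial results) can be illustrated with a minimal sketch. This is not the RDataFrame, Spark or Dask API: it uses only the Python standard library, a thread pool stands in for the cluster, and every name in it (`make_ranges`, `fill_partial_histogram`, `merge`, `distributed_histogram`) is hypothetical and chosen for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def make_ranges(n_entries, n_partitions):
    """Split [0, n_entries) into contiguous, near-equal (start, end) ranges."""
    base, extra = divmod(n_entries, n_partitions)
    ranges, start = [], 0
    for i in range(n_partitions):
        # The first `extra` partitions take one additional entry.
        end = start + base + (1 if i < extra else 0)
        if end > start:  # skip empty ranges when there are few entries
            ranges.append((start, end))
        start = end
    return ranges


def fill_partial_histogram(values, entry_range, bin_width):
    """Map step: histogram only the entries of one range."""
    start, end = entry_range
    hist = Counter()
    for v in values[start:end]:
        hist[int(v // bin_width)] += 1
    return hist


def merge(partials):
    """Reduce step: sum the per-range histograms bin by bin."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def distributed_histogram(values, n_partitions=4, bin_width=10.0):
    """Run the map step on a local thread pool (a stand-in for a cluster)."""
    ranges = make_ranges(len(values), n_partitions)
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        partials = list(
            pool.map(
                lambda r: fill_partial_histogram(values, r, bin_width),
                ranges,
            )
        )
    return merge(partials)
```

Because the merge is a bin-wise sum, the result does not depend on how many ranges the entries were split into. That property is what allows a tool to hide the partitioning and aggregation from the user.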