Python and HPC for High Energy Physics Data Analyses

Cited by: 5
Authors
Sehrish, S. [1]
Kowalkowski, J. [1]
Paterno, M. [1]
Green, C. [1]
Affiliations
[1] Fermilab Natl Accelerator Lab, POB 500, Batavia, IL 60510 USA
Keywords
HEP analysis; MPI; Python; HPC; HDF5; numpy; pandas; mpi4py
DOI
10.1145/3149869.3149877
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
High level abstractions in Python that can utilize computing hardware well seem to be an attractive option for writing data reduction and analysis tasks. In this paper, we explore the features available in Python which are useful and efficient for end user analysis in High Energy Physics (HEP). A typical vertical slice of an HEP data analysis is somewhat fragmented: the state of the reduction/analysis process must be saved at certain stages to allow for selective reprocessing of only parts of a generally time-consuming workflow. Also, algorithms tend to be modular because of the heterogeneous nature of most detectors and the need to analyze different parts of the detector separately before combining the information. This fragmentation causes difficulties for interactive data analysis, and as data sets increase in size and complexity (O(10) TiB for a "small" neutrino experiment to the O(10) PiB currently held by the CMS experiment at the LHC), data analysis methods traditional to the field must evolve to make optimum use of emerging HPC technologies and platforms. Mainstream big data tools, while suggesting a direction in terms of what can be done if an entire data set can be available across a system and analysed with high-level programming abstractions, are not designed with either scientific computing generally, or modern HPC platform features in particular, such as data caching levels, in mind. Our example HPC use case is a search for a new elementary particle which might explain the phenomenon known as "Dark Matter". Using data from the CMS detector, we will use HDF5 as our input data format, and MPI with Python to implement our use case.
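Illustrative note (not part of the published abstract): the pattern the abstract names, HDF5 input processed with MPI from Python, usually means that each rank reads its own contiguous block of events, applies a selection locally, and the per-rank results are combined by a reduction. The sketch below is a minimal illustration of that idea under stated assumptions, not the authors' implementation. The function names, the `missing_energy` values, and the threshold are all hypothetical. The ranks are simulated in a plain loop so that the example runs without MPI or HDF5 installed; in a real program the rank and size would come from mpi4py (`MPI.COMM_WORLD.Get_rank()` and `Get_size()`), the slice would be read from an h5py dataset, and the final sum would be a `comm.reduce` call.

```python
# Hypothetical sketch: block-partition events across ranks, select, then reduce.
# Ranks are simulated in a loop; no MPI or HDF5 library is required to run this.

def block_range(n_items, n_ranks, rank):
    """Return the half-open [start, stop) range of items owned by `rank`.

    Items are split as evenly as possible; the first `n_items % n_ranks`
    ranks each receive one extra item.
    """
    base, extra = divmod(n_items, n_ranks)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def count_passing(values, threshold):
    """Count the values strictly above `threshold` (a stand-in selection cut)."""
    return sum(1 for v in values if v > threshold)


# Stand-in for one column of an HDF5 dataset (invented values, arbitrary units).
missing_energy = [12.0, 250.5, 80.1, 310.0, 5.3, 199.9, 420.7, 64.2, 205.0, 33.3]
THRESHOLD = 200.0  # hypothetical selection cut
N_RANKS = 4        # hypothetical number of MPI ranks

partial_counts = []
for rank in range(N_RANKS):
    start, stop = block_range(len(missing_energy), N_RANKS, rank)
    # With h5py, this slice would be the only part of the dataset a rank reads.
    local_events = missing_energy[start:stop]
    partial_counts.append(count_passing(local_events, THRESHOLD))

# Plays the role of an MPI sum-reduction over the per-rank counts.
total_passing = sum(partial_counts)
print(partial_counts, total_passing)
```

Contiguous blocks are used here because they map directly onto a single slice read per rank; other partitioning schemes are possible, and the paper itself should be consulted for the approach actually taken.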
Pages: 8
Related papers
  • [1] Data Engineering for HPC with Python
    Abeykoon, Vibhatha
    Perera, Niranda
    Widanage, Chathura
    Kamburugamuve, Supun
    Kanewala, Thejaka Amila
    Maithree, Hasara
    Wickramasinghe, Pulasthi
    Uyar, Ahmet
    Fox, Geoffrey
    [J]. PROCEEDINGS OF PYHPC 2020: 2020 IEEE/ACM 9TH WORKSHOP ON PYTHON FOR HIGH-PERFORMANCE AND SCIENTIFIC COMPUTING (PYHPC), 2020, : 13 - 21
  • [2] A Python package for particle physics analyses
    Bevan, Adrian
    Charman, Thomas
    Hays, Jonathan
    [J]. 23RD INTERNATIONAL CONFERENCE ON COMPUTING IN HIGH ENERGY AND NUCLEAR PHYSICS (CHEP 2018), 2019, 214
  • [3] Spark and HPC for High Energy Physics Data Analyses
    Sehrish, Saba
    Kowalkowski, Jim
    Paterno, Marc
    [J]. 2017 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2017, : 1048 - 1057
  • [4] Python Workflows on HPC Systems
    Strassel, Dominik
    Reusch, Philipp
    Keuper, Janis
    [J]. PROCEEDINGS OF PYHPC 2020: 2020 IEEE/ACM 9TH WORKSHOP ON PYTHON FOR HIGH-PERFORMANCE AND SCIENTIFIC COMPUTING (PYHPC), 2020, : 32 - 40
  • [5] WATCHMAN project-A Python CASE framework for High Energy Physics data analysis in the LHC era
    Bianchi, Riccardo Maria
    Bruneliere, Renaud
    [J]. JOURNAL OF COMPUTATIONAL SCIENCE, 2013, 4 (05) : 325 - 333
  • [6] PythonFOAM: In-situ data analyses with OpenFOAM and Python
    Maulik, Romit
    Fytanidis, Dimitrios K.
    Lusch, Bethany
    Vishwanath, Venkatram
    Patel, Saumil
    [J]. JOURNAL OF COMPUTATIONAL SCIENCE, 2022, 62
  • [7] The Space Physics Environment Data Analysis System in Python
    Grimes, Eric W. W.
    Harter, Bryan
    Hatzigeorgiu, Nick
    Drozdov, Alexander
    Lewis, James W. W.
    Angelopoulos, Vassilis
    Cao, Xin
    Chu, Xiangning
    Hori, Tomo
    Matsuda, Shoya
    Jun, Chae-Woo
    Nakamura, Satoko
    Kitahara, Masahiro
    Segawa, Tomonori
    Miyoshi, Yoshizumi
    Le Contel, Olivier
    [J]. FRONTIERS IN ASTRONOMY AND SPACE SCIENCES, 2022, 9
  • [8] Computational Physics with Python
    Landau, Rubin H.
    Bordeianu, Cristian C.
    Paez, Manuel J.
    [J]. ICVL 2009 - PROCEEDINGS OF THE 4TH INTERNATIONAL CONFERENCE ON VIRTUAL LEARNING, 2009, : 112 - +
  • [9] Linkage of XcalableMP and Python languages for high productivity on HPC cluster system
    Nakao, Masahiro
    Murai, Hitoshi
    Boku, Taisuke
    Sato, Mitsuhisa
    [J]. HPC ASIA'18: PROCEEDINGS OF WORKSHOPS OF HPC ASIA, 2018, : 39 - 47
  • [10] Soil Physics with Python
    Swingler, K.
    [J]. EUROPEAN JOURNAL OF SOIL SCIENCE, 2015, 66 (05) : 963 - 963