I/O Characterization and Performance Evaluation of BeeGFS for Deep Learning

Cited by: 50
Authors
Chowdhury, Fahim [1 ]
Zhu, Yue [1 ]
Heer, Todd [2 ]
Paredes, Saul [1 ]
Moody, Adam [2 ]
Goldstone, Robin [2 ]
Mohror, Kathryn [2 ]
Yu, Weikuan [1 ]
Affiliations
[1] Florida State Univ, Tallahassee, FL 32306 USA
[2] Lawrence Livermore Natl Lab, Livermore, CA USA
Funding
National Science Foundation (US)
DOI
10.1145/3337821.3337902
Chinese Library Classification
TP3 [Computing Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Parallel File Systems (PFSs) are frequently deployed on leadership High Performance Computing (HPC) systems to ensure efficient I/O, persistent storage, and scalable performance. Emerging Deep Learning (DL) applications impose new I/O and storage requirements on HPC systems through batched input of small random files, which demands that PFSs provide commensurate features to meet the needs of DL applications. BeeGFS is a recently emerging PFS that has attracted attention from both research and industry because of its performance, scalability, and ease of use. In this paper, we present the architectural and system features of BeeGFS and conduct a systematic performance analysis through experimental evaluation with cutting-edge I/O, metadata, and DL application benchmarks. In particular, we use the AlexNet and ResNet-50 models to classify the ImageNet dataset with the Livermore Big Artificial Neural Network Toolkit (LBANN) and with an ImageNet data reader pipeline atop TensorFlow and Horovod. Through extensive performance characterization of BeeGFS, our study provides useful documentation on how to leverage BeeGFS for emerging DL applications.
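The "batched input of small random files" access pattern that the abstract identifies can be illustrated with a minimal, framework-free Python sketch (the function name, file names, and batch size below are illustrative, not from the paper; real DL pipelines such as TensorFlow's tf.data perform the same shuffle-and-batch reads at ImageNet scale):

```python
import random
import tempfile
from pathlib import Path

def batched_random_reads(paths, batch_size, seed=0):
    """Yield batches of file contents in shuffled order, mimicking the
    'batched input of small random files' I/O pattern of DL data readers."""
    order = list(paths)
    random.Random(seed).shuffle(order)      # new random sample order each epoch
    for i in range(0, len(order), batch_size):
        # Each batch triggers many small, scattered reads on the file system.
        yield [Path(p).read_bytes() for p in order[i:i + batch_size]]

# Create 10 tiny stand-in files (a real workload reads millions of ImageNet JPEGs).
tmpdir = Path(tempfile.mkdtemp())
paths = []
for n in range(10):
    p = tmpdir / f"sample_{n}.bin"
    p.write_bytes(bytes([n]) * 4)           # 4-byte dummy payload
    paths.append(p)

batches = list(batched_random_reads(paths, batch_size=4))
print(len(batches))                         # 10 files / batch size 4 -> 3 batches
```

From the PFS's point of view, the cost of this pattern is dominated by metadata lookups and small random reads rather than streaming bandwidth, which is why the paper evaluates BeeGFS with both metadata and DL application benchmarks.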
Pages: 10
Related Papers (50 total)
  • [1] The role of storage target allocation in applications' I/O performance with BeeGFS
    Boito, Francieli
    Pallez, Guillaume
    Teylo, Luan
    2022 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER 2022), 2022, : 267 - 277
  • [2] Does Varying BeeGFS Configuration Affect the I/O Performance of HPC Workloads?
    Borkar, Arnav
    Tony, Joel
    Vamsi, Hari K. N.
    Barman, Tushar
    Bhisikar, Yash
    Sreenath, T. M.
    Paul, Arnab K.
    2023 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING WORKSHOPS, CLUSTER WORKSHOPS, 2023, : 5 - 7
  • [3] High Performance I/O For Large Scale Deep Learning
    Aizman, Alex
    Maltby, Gavin
    Breuel, Thomas
    2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2019, : 5965 - 5967
  • [4] Improving the I/O of large geophysical models using PnetCDF and BeeGFS
    Brzenski, Jared
    Paolini, Christopher
    Castillo, Jose E.
    PARALLEL COMPUTING, 2021, 104
  • [5] PHDFS: Optimizing I/O performance of HDFS in deep learning cloud computing platform
    Zhu, Zongwei
    Tan, Luchao
    Li, Yinzhen
    Ji, Cheng
    JOURNAL OF SYSTEMS ARCHITECTURE, 2020, 109
  • [6] I/O performance evaluation with Parabench - programmable I/O benchmark
    Mordvinova, Olga
    Runz, Dennis
    Kunkel, Julian M.
    Ludwig, Thomas
    ICCS 2010 - INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE, PROCEEDINGS, 2010, 1 (01): 2119 - 2128
  • [7] PARALLEL I/O OPTIMIZATIONS FOR SCALABLE DEEP LEARNING
    Pumma, Sarunya
    Si, Min
    Feng, Wu-chun
    Balaji, Pavan
    2017 IEEE 23RD INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS (ICPADS), 2017, : 720 - 729
  • [8] A NEW APPROACH TO I/O PERFORMANCE EVALUATION - SELF-SCALING I/O BENCHMARKS, PREDICTED I/O PERFORMANCE
    CHEN, PM
    PATTERSON, DA
    ACM TRANSACTIONS ON COMPUTER SYSTEMS, 1994, 12 (04): 308 - 339
  • [9] Evaluation of hyperspectral data for deep learning model performance
    Butler, Samantha J.
    Price, Stanton R.
    Carley, Samantha S.
    Land, Haley B.
    Price, Steven R.
    ALGORITHMS, TECHNOLOGIES, AND APPLICATIONS FOR MULTISPECTRAL AND HYPERSPECTRAL IMAGING XXX, 2024, 13031
  • [10] Performance Evaluation of Deep Learning Compilers for Edge Inference
    Verma, Gaurav
    Gupta, Yashi
    Malik, Abid M.
    Chapman, Barbara
    2021 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2021, : 858 - 865