Production Deployment of Machine-Learned Rotorcraft Surrogate Models on HPC

Cited by: 3
Authors
Brewer, Wesley [1]
Martinez, Daniel [2]
Boyer, Mathew [1]
Jude, Dylan [3]
Wissink, Andy [3]
Parsons, Ben [4]
Yin, Junqi [5]
Anantharaj, Valentine [5]
Affiliations
[1] DoD HPCMP PET GDIT, Vicksburg, MS 39335 USA
[2] Sci & Technol Corp, Moffett Field, CA USA
[3] US Army DEVCOM AvMC DSE, Moffett Field, CA USA
[4] DoD HPCMP, Vicksburg, MS USA
[5] Oak Ridge Leadership Comp Facil, Oak Ridge, TN USA
Keywords
surrogate; inference; production; HPC
DOI
10.1109/MLHPC54614.2021.00008
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We explore how to optimally deploy several different types of machine-learned surrogate models used in rotorcraft aerodynamics on HPC. We first developed three rotorcraft models spanning three orders of magnitude in size (2M, 44M, and 212M trainable parameters) to use as test models. We then developed a benchmark, which we call "smiBench", that uses synthetic data to test a wide range of alternative configurations and study optimal deployment scenarios. We discovered several different types of optimal deployment scenarios, depending on the model size and inference frequency. For most cases, it makes sense to use multiple inference servers, each bound to a GPU, with a load balancer distributing the requests across the GPUs. We tested three types of inference server deployments: (1) a custom Flask-based HTTP inference server, (2) TensorFlow Serving with the gRPC protocol, and (3) a RedisAI server with SmartRedis clients using the RESP protocol. We also tested three load-balancing techniques for multi-GPU inference: (1) a Python concurrent.futures thread pool, (2) HAProxy, and (3) mpi4py. We investigated deployments on both DoD HPCMP's SCOUT and DOE OLCF's Summit POWER9 supercomputers, demonstrated the ability to perform inference on a million samples per second using 192 GPUs, and studied multiple scenarios on both Nvidia T4 and V100 GPUs. Moreover, we studied a range of concurrency levels, on both the client side and the server side, and provide optimal configuration advice based on the type of deployment. Finally, we provide a simple Python-based framework for benchmarking machine-learned surrogate models using the various inference servers.
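The load-balancing pattern named in the abstract (several inference servers, each bound to a GPU, with requests spread across them from a concurrent.futures thread pool) can be sketched roughly as follows. This is an illustrative sketch only and is not taken from smiBench: `make_fake_server`, `run_benchmark`, the round-robin assignment, and the sleep-based latency are all assumptions made here so the example runs without any real server or GPU.

```python
import itertools
import time
from concurrent.futures import ThreadPoolExecutor


def make_fake_server(gpu_id, latency_s=0.001):
    """Hypothetical stand-in for one inference server bound to one GPU."""
    def infer(batch):
        time.sleep(latency_s)  # placeholder for network plus GPU time
        # Dummy "prediction": tag each result with the serving GPU id.
        return [(gpu_id, 2.0 * x) for x in batch]
    return infer


def run_benchmark(servers, batches, client_threads):
    """Dispatch batches round-robin over servers from a thread pool.

    Returns the per-batch results (in submission order) and the
    measured throughput in samples per second.
    """
    targets = itertools.cycle(servers)
    assignments = [(next(targets), batch) for batch in batches]
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=client_threads) as pool:
        # pool.map preserves the order of the input assignments.
        results = list(pool.map(lambda sb: sb[0](sb[1]), assignments))
    elapsed = time.perf_counter() - start
    n_samples = sum(len(batch) for batch in batches)
    return results, n_samples / elapsed


# Synthetic data: 32 batches of 8 samples, spread over 4 fake servers.
servers = [make_fake_server(gpu_id) for gpu_id in range(4)]
batches = [[float(i)] * 8 for i in range(32)]
results, throughput = run_benchmark(servers, batches, client_threads=8)
print(f"{throughput:.0f} samples/s across {len(servers)} servers")
```

In a real deployment each stand-in function would be replaced by a client call to an actual server, and the number of client threads would be one of the concurrency levels being varied.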
Pages: 21-32
Page count: 12