AIPerf: Automated Machine Learning as an AI-HPC Benchmark

Cited by: 10
Authors
Ren, Zhixiang [1]
Liu, Yongheng [1]
Shi, Tianhui [2]
Xie, Lei [2]
Zhou, Yue [1]
Zhai, Jidong [2]
Zhang, Youhui [2]
Zhang, Yunquan [3]
Chen, Wenguang [2]
Affiliations
[1] Peng Cheng National Laboratory, Shenzhen 518000, People's Republic of China
[2] Tsinghua University, Department of Computer Science and Technology, Beijing 100084, People's Republic of China
[3] Chinese Academy of Sciences, Institute of Computing Technology, Beijing 100086, People's Republic of China
Keywords
High-Performance Computing (HPC); Artificial Intelligence (AI); automated machine learning; systems
DOI
10.26599/BDMA.2021.9020004
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The plethora of complex Artificial Intelligence (AI) algorithms and available High-Performance Computing (HPC) power stimulates the expeditious development of AI components with heterogeneous designs. Consequently, the need for cross-stack performance benchmarking of AI-HPC systems has rapidly emerged. In particular, the de facto HPC benchmark, LINPACK, cannot reflect the AI computing power and input/output performance without a representative workload. Current popular AI benchmarks, such as MLPerf, have a fixed problem size and therefore limited scalability. To address these issues, we propose an end-to-end benchmark suite utilizing automated machine learning, which not only represents real AI scenarios, but also is auto-adaptively scalable to various scales of machines. We implement the algorithms in a highly parallel and flexible way to ensure the efficiency and optimization potential on diverse systems with customizable configurations. We utilize Operations Per Second (OPS), which is measured in an analytical and systematic approach, as a major metric to quantify the AI performance. We perform evaluations on various systems to ensure the benchmark's stability and scalability, from 4 nodes with 32 NVIDIA Tesla T4 (56.1 Tera-OPS measured) up to 512 nodes with 4096 Huawei Ascend 910 (194.53 Peta-OPS measured), and the results show near-linear weak scalability. With a flexible workload and single metric, AIPerf can easily scale on and rank AI-HPC, providing a powerful benchmark suite for the coming supercomputing era.
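As a rough illustration of the figures quoted in the abstract, the sketch below divides each measured OPS total by its accelerator count. It also defines a weak-scaling efficiency ratio, which is only meaningful between runs on the same hardware; the abstract does not give such a pair, so the function is not applied to the two reported systems. The names `per_accelerator_ops` and `weak_scaling_efficiency` are hypothetical helpers for this note, not part of AIPerf.

```python
# Figures quoted in the abstract: (nodes, accelerators, measured OPS).
TERA = 1e12
PETA = 1e15

reported = {
    "NVIDIA Tesla T4": (4, 32, 56.1 * TERA),
    "Huawei Ascend 910": (512, 4096, 194.53 * PETA),
}


def per_accelerator_ops(total_ops, accelerators):
    """Average measured OPS contributed by one accelerator."""
    return total_ops / accelerators


def weak_scaling_efficiency(base_ops, base_accels, scaled_ops, scaled_accels):
    """Per-accelerator throughput at scale relative to a baseline run.

    Hypothetical helper: assumes both runs use the same hardware and the
    workload grows with the machine (weak scaling). 1.0 means perfectly linear.
    """
    base = per_accelerator_ops(base_ops, base_accels)
    scaled = per_accelerator_ops(scaled_ops, scaled_accels)
    return scaled / base


for name, (nodes, accels, ops) in reported.items():
    per_accel = per_accelerator_ops(ops, accels) / TERA
    print(f"{name}: {accels // nodes} accelerators per node, "
          f"{per_accel:.2f} Tera-OPS per accelerator")
```

The two systems use different accelerators, so their per-accelerator numbers describe each machine separately and should not be read as a scaling curve.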
Pages: 208-220
Page count: 13