AIPerf: Automated Machine Learning as an AI-HPC Benchmark

Cited by: 10
Authors
Ren, Zhixiang [1]
Liu, Yongheng [1]
Shi, Tianhui [2]
Xie, Lei [2]
Zhou, Yue [1]
Zhai, Jidong [2]
Zhang, Youhui [2]
Zhang, Yunquan [3]
Chen, Wenguang [2]
Affiliations
[1] Peng Cheng Natl Lab, Shenzhen 518000, Peoples R China
[2] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
[3] Chinese Acad Sci, Inst Comp Technol, Beijing 100086, Peoples R China
Keywords
High-Performance Computing (HPC); Artificial Intelligence (AI); automated machine learning; SYSTEMS;
DOI
10.26599/BDMA.2021.9020004
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The plethora of complex Artificial Intelligence (AI) algorithms and available High-Performance Computing (HPC) power stimulates the expeditious development of AI components with heterogeneous designs. Consequently, the need for cross-stack performance benchmarking of AI-HPC systems has rapidly emerged. In particular, the de facto HPC benchmark, LINPACK, cannot reflect the AI computing power and input/output performance without a representative workload. Current popular AI benchmarks, such as MLPerf, have a fixed problem size and therefore limited scalability. To address these issues, we propose an end-to-end benchmark suite utilizing automated machine learning, which not only represents real AI scenarios, but also is auto-adaptively scalable to various scales of machines. We implement the algorithms in a highly parallel and flexible way to ensure the efficiency and optimization potential on diverse systems with customizable configurations. We utilize Operations Per Second (OPS), which is measured in an analytical and systematic approach, as a major metric to quantify the AI performance. We perform evaluations on various systems to ensure the benchmark's stability and scalability, from 4 nodes with 32 NVIDIA Tesla T4 (56.1 Tera-OPS measured) up to 512 nodes with 4096 Huawei Ascend 910 (194.53 Peta-OPS measured), and the results show near-linear weak scalability. With a flexible workload and single metric, AIPerf can easily scale on and rank AI-HPC, providing a powerful benchmark suite for the coming supercomputing era.
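The two measurements quoted in the abstract can be put on a per-accelerator footing with simple arithmetic. The following is a minimal sketch, not the benchmark's own code: the function names `per_device_ops` and `weak_scaling_efficiency` are hypothetical, and the efficiency definition (per-device throughput at scale divided by per-device throughput at a baseline) is a common convention assumed here rather than one stated in the abstract.

```python
# Illustrative arithmetic only; helper names are hypothetical, not from AIPerf.
TERA = 1e12
PETA = 1e15


def per_device_ops(total_ops: float, num_devices: int) -> float:
    """Average throughput per accelerator, in the same unit as total_ops."""
    if num_devices <= 0:
        raise ValueError("num_devices must be positive")
    return total_ops / num_devices


def weak_scaling_efficiency(base_ops: float, base_devices: int,
                            scaled_ops: float, scaled_devices: int) -> float:
    """Per-device throughput at scale relative to the baseline run.

    Assumed definition: a value of 1.0 means perfectly linear weak scaling.
    Only meaningful when both runs use the same hardware.
    """
    return (per_device_ops(scaled_ops, scaled_devices)
            / per_device_ops(base_ops, base_devices))


if __name__ == "__main__":
    # Figures quoted in the abstract.
    t4 = per_device_ops(56.1 * TERA, 32)           # 32 NVIDIA Tesla T4
    ascend = per_device_ops(194.53 * PETA, 4096)   # 4096 Huawei Ascend 910
    print(f"Tesla T4:   {t4 / TERA:.2f} Tera-OPS per device")      # 1.75
    print(f"Ascend 910: {ascend / TERA:.2f} Tera-OPS per device")  # 47.49
```

Because the two systems use different accelerators, `weak_scaling_efficiency` should not be applied across them; it is intended for runs of different sizes on the same hardware, which the abstract summarizes only as "near-linear" without giving per-run figures.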
Pages: 208 - 220
Page count: 13
Related papers
50 in total
  • [1] AIPerf: Automated Machine Learning as an AI-HPC Benchmark
    Zhixiang Ren
    Yongheng Liu
    Tianhui Shi
    Lei Xie
    Yue Zhou
    Jidong Zhai
    Youhui Zhang
    Yunquan Zhang
    Wenguang Chen
    Big Data Mining and Analytics, 2021, 4 (03) : 208 - 220
  • [2] MLPerf™ HPC: A Holistic Benchmark Suite for Scientific Machine Learning on HPC Systems
    Farrell, Steven
    Emani, Murali
    Balma, Jacob
    Drescher, Lukas
    Drozd, Aleksandr
    Fink, Andreas
    Fox, Geoffrey
    Kanter, David
    Kurth, Thorsten
    Mattson, Peter
    Mu, Dawei
    Ruhela, Amit
    Sato, Kento
    Shirahata, Koichi
    Tabaru, Tsuguchika
    Tsaris, Aristeidis
    Balewski, Jan
    Cumming, Ben
    Danjo, Takumi
    Domke, Jens
    Fukai, Takaaki
    Fukumoto, Naoto
    Fukushi, Tatsuya
    Gerofi, Balazs
    Honda, Takumi
    Imamura, Toshiyuki
    Kasagi, Akihiko
    Kawakami, Kentaro
    Kudo, Shuhei
    Kuroda, Akiyoshi
    Martinasso, Maxime
    Matsuoka, Satoshi
    Mendonca, Henrique
    Minami, Kazuki
    Ram, Prabhat
    Sawada, Takashi
    Shankar, Mallikarjun
    St John, Tom
    Tabuchi, Akihiro
    Vishwanath, Venkatram
    Wahib, Mohamed
    Yamazaki, Masafumi
    Yin, Junqi
    PROCEEDINGS OF THE WORKSHOP ON MACHINE LEARNING IN HIGH PERFORMANCE COMPUTING ENVIRONMENTS (MLHPC 2021), 2021, : 33 - 45
  • [3] Benchmark and Survey of Automated Machine Learning Frameworks
    Zoeller, Marc-Andre
    Huber, Marco F.
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2021, 70 : 409 - 472
  • [4] Benchmark and Survey of Automated Machine Learning Frameworks
    Zöller M.-A.
    Huber M.F.
    Journal of Artificial Intelligence Research, 2021, 70 : 409 - 472
  • [5] Automated Performance Modeling of HPC Applications Using Machine Learning
    Sun, Jingwei
    Sun, Guangzhong
    Zhan, Shiyan
    Zhang, Jiepeng
    Chen, Yong
    IEEE TRANSACTIONS ON COMPUTERS, 2020, 69 (05) : 749 - 763
  • [6] From Zero to AI Hero with Automated Machine Learning
    Umamahesan, Aniththa
    Babu, Deepak Mukunthu Iyappan
    KDD '20: PROCEEDINGS OF THE 26TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2020, : 3495 - 3495
  • [7] Applying Machine Learning and AI on Self Automated Personalized Online Learning
    Srisa-An, Chetneti
    Yongsiriwit, Karn
    FUZZY SYSTEMS AND DATA MINING V (FSDM 2019), 2019, 320 : 137 - 145
  • [8] Automated transition metal catalysts discovery and optimisation with AI and Machine Learning
    Mace, Samuel
    Xu, Yingjian
    Nguyen, Bao N.
    CHEMCATCHEM, 2024, 16 (10)
  • [9] TPCx-AI - An Industry Standard Benchmark for Artificial Intelligence and Machine Learning Systems
    Bruecke, Christoph
    Haertling, Philipp
    Palacios, Rodrigo D. Escobar
    Patel, Hamesh
    Rabl, Tilmann
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2023, 16 (12) : 3649 - 3661
  • [10] Improving Automated Machine-Learning Systems through Green AI
    Castellanos-Nieves, Dagoberto
    Garcia-Forte, Luis
    APPLIED SCIENCES-BASEL, 2023, 13 (20)