Scale-Out vs Scale-Up: A Study of ARM-based SoCs on Server-Class Workloads

Cited by: 5
Authors
Azimi, Reza [1]
Fox, Tyler [1]
Gonzalez, Wendy [1]
Reda, Sherief [1]
Affiliations
[1] Brown Univ, Sch Engn, 184 Hope St, Providence, RI 02906 USA
Keywords
ARM computing; GPGPU acceleration; scale-out clusters
DOI
10.1145/3232162
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
ARM 64-bit processing has generated enthusiasm to develop ARM-based servers that are targeted at both data centers and supercomputers. In addition to the server-class components and hardware advancements, the ARM software environment has grown substantially over the past decade. Major development ecosystems and libraries have been ported and optimized to run on ARM, making ARM suitable for server-class workloads. There are two trends in available ARM SoCs: mobile-class ARM SoCs that rely on the heterogeneous integration of a mix of CPU cores, GPGPU streaming multiprocessors (SMs), and other accelerators, and server-class SoCs that instead rely on integrating a larger number of CPU cores with no GPGPU support and a number of IO accelerators. For scaling the number of processing cores, there are two different paradigms: mobile-class SoCs that use a scale-out architecture in the form of a cluster of simpler systems connected over a network, and server-class ARM SoCs that use the scale-up solution and leverage symmetric multiprocessing to pack a large number of cores on the chip. In this article, we present the ScaleSoC cluster, a scale-out solution based on mobile-class ARM SoCs. ScaleSoC leverages fast network connectivity and GPGPU acceleration to improve performance and energy efficiency compared to previous ARM scale-out clusters. We consider a wide range of modern server-class parallel workloads to study both scaling paradigms, including latency-sensitive transactional workloads, MPI-based CPU and GPGPU-accelerated scientific applications, and emerging artificial intelligence workloads. We study the performance and energy efficiency of ScaleSoC compared to server-class ARM SoCs and discrete GPGPUs in depth.
We quantify the network overhead on the performance of ScaleSoC and show that packing a large number of ARM cores on a single chip does not necessarily guarantee better performance, because shared resources, such as the last-level cache, become performance bottlenecks. We characterize the GPGPU-accelerated workloads and demonstrate that for applications that can leverage the better CPU-GPGPU balance of the ScaleSoC cluster, performance and energy efficiency improve compared to discrete GPGPUs.
Pages: 23