Automatic Core Specialization for AVX-512 Applications

Cited by: 7
Authors
Gottschlag, Mathias [1]
Brantsch, Peter [1]
Bellosa, Frank [1]
Affiliations
[1] Karlsruhe Inst Technol, Karlsruhe, Germany
Keywords
AVX-512; core specialization; dim silicon
DOI
10.1145/3383669.3398282
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Advanced Vector Extensions (AVX) instructions operate on wide SIMD vectors. Because of the resulting high power consumption, recent Intel processors reduce their frequency when executing complex AVX2 and AVX-512 instructions. Subsequent non-AVX code is slowed down by this frequency reduction in two situations: when it executes in parallel on the sibling hyperthread of the same core, or, because restoring the non-AVX frequency is delayed, when it directly follows the AVX2/AVX-512 code. As a result, heterogeneous workloads consisting of AVX-512 and non-AVX code are frequently slowed down by 10% on average. In this work, we describe a method to mitigate the frequency reduction slowdown for workloads involving AVX-512 instructions in both situations. Our approach employs core specialization: it partitions the CPU cores into AVX-512 cores and non-AVX-512 cores, and only the former execute AVX-512 instructions, so that the impact of potential frequency reductions is limited to those cores. To migrate threads to AVX-512 cores, we configure the non-AVX-512 cores to raise an exception when executing AVX-512 instructions. We use a heuristic to determine when to migrate threads back to non-AVX-512 cores. Our approach is able to reduce the frequency reduction overhead by 70% for an assortment of common benchmarks.
Pages: 25 - 35
Page count: 11
Related papers
50 in total
  • [21] Fast Multiple-Precision Integer Division Using Intel AVX-512
    Edamatsu, Takuya
    Takahashi, Daisuke
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, 2023, 11 (01) : 224 - 236
  • [22] An implementation of matrix-matrix multiplication on the Intel KNL processor with AVX-512
    Lim, Roktaek
    Lee, Yeongha
    Kim, Raehyun
    Choi, Jaeyoung
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2018, 21 (04) : 1785 - 1795
  • [23] Combining Algorithmic Rethinking and AVX-512 Intrinsics for Efficient Simulation of Subcellular Calcium Signaling
    Jarvis, Chad
    Lines, Glenn Terje
    Langguth, Johannes
    Nakajima, Kengo
    Cai, Xing
    COMPUTATIONAL SCIENCE - ICCS 2019, PT V, 2019, 11540 : 681 - 687
  • [24] A new AXT format for an efficient SpMV product using AVX-512 instructions and CUDA
    Coronado-Barrientos, E.
    Antonioletti, M.
    Garcia-Loureiro, A.
    ADVANCES IN ENGINEERING SOFTWARE, 2021, 156
  • [25] Vectorization of High-performance Scientific Calculations Using AVX-512 Intruction Set
    B. M. Shabanov
    A. A. Rybakov
    S. S. Shumilin
    Lobachevskii Journal of Mathematics, 2019, 40 : 580 - 598
  • [26] An Implementation of Parallel Number-Theoretic Transform Using Intel AVX-512 Instructions
    Takahashi, Daisuke
    COMPUTER ALGEBRA IN SCIENTIFIC COMPUTING (CASC 2022), 2022, 13366 : 318 - 332
  • [27] Optimizing parallel GEMM routines using auto-tuning with Intel AVX-512
    Kim, Raehyun
    Choi, Jaeyoung
    Lee, Myungho
    PROCEEDINGS OF INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING IN ASIA-PACIFIC REGION (HPC ASIA 2019), 2019 : 101 - 110
  • [28] Vectorized Parallel Sparse Matrix-Vector Multiplication in PETSc Using AVX-512
    Zhang, Hong
    Mills, Richard T.
    Rupp, Karl
    Smith, Barry F.
    PROCEEDINGS OF THE 47TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, 2018
  • [29] Hydrogen-helium chemical and nuclear galaxy collision: Hydrodynamic simulations on AVX-512 supercomputers
    Chernykh, Igor
    Kulikov, Igor
    Tutukov, Alexander
    JOURNAL OF COMPUTATIONAL AND APPLIED MATHEMATICS, 2021, 391 (391)
  • [30] Vectorization of High-performance Scientific Calculations Using AVX-512 Intruction Set
    Shabanov, B. M.
    Rybakov, A. A.
    Shumilin, S. S.
    LOBACHEVSKII JOURNAL OF MATHEMATICS, 2019, 40 (05) : 580 - 598