Automatic Core Specialization for AVX-512 Applications

Cited by: 7
Authors
Gottschlag, Mathias [1]
Brantsch, Peter [1]
Bellosa, Frank [1]
Affiliations
[1] Karlsruhe Inst Technol, Karlsruhe, Germany
Keywords
AVX-512; core specialization; dim silicon
DOI
10.1145/3383669.3398282
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Advanced Vector Extension (AVX) instructions operate on wide SIMD vectors. Due to the resulting high power consumption, recent Intel processors reduce their frequency when executing complex AVX2 and AVX-512 instructions. Following non-AVX code is slowed down by this frequency reduction in two situations: When it executes on the sibling hyperthread of the same core in parallel or - as restoring the non-AVX frequency is delayed - when it directly follows the AVX2/AVX-512 code. As a result, heterogeneous workloads consisting of AVX-512 and non-AVX code are frequently slowed down by 10% on average. In this work, we describe a method to mitigate the frequency reduction slowdown for workloads involving AVX-512 instructions in both situations. Our approach employs core specialization and partitions the CPU cores into AVX-512 cores and non-AVX-512 cores, and only the former execute AVX-512 instructions so that the impact of potential frequency reductions is limited to those cores. To migrate threads to AVX-512 cores, we configure the non-AVX-512 cores to raise an exception when executing AVX-512 instructions. We use a heuristic to determine when to migrate threads back to non-AVX-512 cores. Our approach is able to reduce the frequency reduction overhead by 70% for an assortment of common benchmarks.
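The migration policy described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' implementation: the core numbering, the `Avx512Trap` exception, the `MIGRATE_BACK_AFTER` threshold, and the "consecutive quiet steps" rule for migrating back are all hypothetical, since the abstract does not specify the heuristic.

```python
from dataclasses import dataclass

# Hypothetical partition of a four-core CPU (not taken from the paper).
AVX512_CORES = {0}            # cores allowed to execute AVX-512 code
NON_AVX512_CORES = {1, 2, 3}  # cores configured to trap on AVX-512 code

# Hypothetical heuristic: migrate back after this many consecutive
# steps without AVX-512 instructions on an AVX-512 core.
MIGRATE_BACK_AFTER = 3


class Avx512Trap(Exception):
    """Stands in for the exception raised on a non-AVX-512 core."""


@dataclass
class Thread:
    name: str
    core: int
    quiet_steps: int = 0  # steps since the last AVX-512 instruction


def execute(thread, uses_avx512):
    """Run one step on the current core; trap if AVX-512 is not permitted."""
    if uses_avx512 and thread.core in NON_AVX512_CORES:
        raise Avx512Trap(thread.name)


def step(thread, uses_avx512):
    """Run one step and apply the migration policy; return the core used."""
    try:
        execute(thread, uses_avx512)
    except Avx512Trap:
        # The trap triggers migration to an AVX-512 core, then a retry.
        thread.core = min(AVX512_CORES)
        execute(thread, uses_avx512)
    if uses_avx512:
        thread.quiet_steps = 0
    elif thread.core in AVX512_CORES:
        thread.quiet_steps += 1
        if thread.quiet_steps >= MIGRATE_BACK_AFTER:
            # Assumed heuristic: the AVX-512 phase is considered over.
            thread.core = min(NON_AVX512_CORES)
            thread.quiet_steps = 0
    return thread.core


def run(trace, start_core=1):
    """Replay a trace of booleans (True = step uses AVX-512)."""
    thread = Thread("worker", start_core)
    return [step(thread, uses) for uses in trace]
```

For example, `run([False, True, False, False, False, False])` starts on core 1, moves to core 0 at the first AVX-512 step, and returns to core 1 after three quiet steps. A real system would do this in the operating system scheduler rather than in user code.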
Pages: 25-35
Number of pages: 11