Lattice QCD on Intel® Xeon Phi™ Coprocessors

Authors
Joo, Balint [1]
Kalamkar, Dhiraj D. [2]
Vaidyanathan, Karthikeyan [2]
Smelyanskiy, Mikhail [3]
Pamnany, Kiran [2]
Lee, Victor W. [3]
Dubey, Pradeep [3]
Watson, William, III [1]
Affiliations
[1] Thomas Jefferson Natl Accelerator Facil, Newport News, VA 23606 USA
[2] Intel Corp, Parallel Comp Lab, Bangalore, Karnataka, India
[3] Intel Corp, Parallel Comp Lab, Santa Clara, CA USA
Source
SUPERCOMPUTING (ISC 2013) | 2013 / Vol. 7905
Keywords
SOLVERS;
DOI
Not available
Chinese Library Classification
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Lattice Quantum Chromodynamics (LQCD) is currently the only known model-independent, non-perturbative computational method for calculations in the theory of the strong interactions, and it is important in studies of nuclear and high-energy physics. LQCD codes use large fractions of supercomputing cycles worldwide and are often among the first to be ported to new high-performance computing architectures. The recently released Intel Xeon Phi architecture (codename Knights Corner, KNC) from Intel Corporation features parallelism at the level of many x86-based cores, multiple threads per core, and vector processing units. In this contribution, we describe our experiences with optimizing a key LQCD kernel for the Xeon Phi architecture. On a single node, using single precision, our Dslash kernel sustains a performance of up to 320 GFLOPS, while our Conjugate Gradients solver sustains up to 237 GFLOPS. Furthermore, we demonstrate a fully 'native' multi-node LQCD implementation running entirely on KNC nodes with minimal involvement of the host CPU. Our multi-node implementation of the solver has been strong-scaled to 3.9 TFLOPS on 32 KNCs.
Pages: 40-54
Page count: 15