Lattice QCD on Intel® Xeon Phi™ Coprocessors

Cited by: 0
Authors
Joo, Balint [1]
Kalamkar, Dhiraj D. [2]
Vaidyanathan, Karthikeyan [2]
Smelyanskiy, Mikhail [3]
Pamnany, Kiran [2]
Lee, Victor W. [3]
Dubey, Pradeep [3]
Watson, William, III [1]
Affiliations
[1] Thomas Jefferson Natl Accelerator Facil, Newport News, VA 23606 USA
[2] Intel Corp, Parallel Comp Lab, Bangalore, Karnataka, India
[3] Intel Corp, Parallel Comp Lab, Santa Clara, CA USA
Source
SUPERCOMPUTING (ISC 2013) | 2013 / Vol. 7905
Keywords
SOLVERS;
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Lattice Quantum Chromodynamics (LQCD) is currently the only known model-independent, non-perturbative computational method for calculations in the theory of the strong interactions, and is of importance in studies of nuclear and high-energy physics. LQCD codes use large fractions of supercomputing cycles worldwide and are often amongst the first to be ported to new high-performance computing architectures. The recently released Intel Xeon Phi architecture from Intel Corporation features parallelism at the level of many x86-based cores, multiple threads per core, and vector processing units. In this contribution, we describe our experiences with optimizing a key LQCD kernel for the Xeon Phi architecture. On a single node, using single precision, our Dslash kernel sustains a performance of up to 320 GFLOPS, while our Conjugate Gradients solver sustains up to 237 GFLOPS. Furthermore, we demonstrate a fully 'native' multi-node LQCD implementation running entirely on Knights Corner (KNC) nodes with minimal involvement of the host CPU. Our multi-node implementation of the solver has been strong scaled to 3.9 TFLOPS on 32 KNCs.
Pages: 40-54
Number of pages: 15