Benchmarking Data Analysis and Machine Learning Applications on the Intel KNL Many-Core Processor

Authors
Byun, Chansup [1]
Kepner, Jeremy [1]
Arcand, William [1]
Bestor, David [1]
Bergeron, Bill [1]
Gadepally, Vijay [1]
Houle, Michael [1]
Hubbell, Matthew [1]
Jones, Michael [1]
Klein, Anna [1]
Michaleas, Peter [1]
Milechin, Lauren [1]
Mullen, Julie [1]
Prout, Andrew [1]
Rosa, Antonio [1]
Samsi, Siddharth [1]
Yee, Charles [1]
Reuther, Albert [1]
Affiliations
[1] MIT, Lincoln Lab, Supercomp Ctr, 244 Wood St, Lexington, MA 02173 USA
Keywords
Benchmark; MATLAB; Octave; DGEMM; throughput; performance; machine learning; Caffe; Haswell; Knights Landing
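DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract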
Knights Landing (KNL) is the code name for the second-generation Intel Xeon Phi product family. KNL has generated significant interest in the data analysis and machine learning communities because its new many-core architecture targets both of these workloads. The KNL many-core vector processor design enables it to exploit much higher levels of parallelism. At the Lincoln Laboratory Supercomputing Center (LLSC), the majority of users run data analysis applications such as MATLAB and Octave. More recently, machine learning applications, such as the UC Berkeley Caffe deep learning framework, have become increasingly important to LLSC users. The performance of these applications on KNL systems is therefore of high interest to LLSC users and to the broader data analysis and machine learning communities. Our data analysis benchmarks of these applications on the Intel KNL processor indicate that single-core double-precision generalized matrix multiply (DGEMM) performance on KNL systems has improved by approximately 3.5x compared to prior Intel Xeon technologies. Our data analysis applications also achieved approximately 60% of the theoretical peak performance. In addition, a performance comparison of a machine learning application, Caffe, between two Intel CPUs, the Xeon E5 v3 and the Xeon Phi 7210, demonstrated a 2.7x improvement on a KNL node.
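The abstract reports DGEMM throughput as a fraction of theoretical peak. The minimal Python sketch below illustrates how such a figure is commonly derived; it is not the authors' benchmark code. The function names are hypothetical, and the hardware parameters in the usage section (64 cores, 1.3 GHz, 32 double-precision operations per cycle per core) are assumed nominal values for illustration and are not taken from the abstract. The pure-Python multiply is only a stand-in for an optimized DGEMM and will run far below any hardware peak.

```python
import time


def dgemm_flop_count(m, n, k):
    """Floating-point operations for C = A * B with A (m x k) and B (k x n)."""
    # Each of the m*n output elements needs k multiplies and k adds.
    return 2 * m * n * k


def achieved_gflops(m, n, k, seconds):
    """Measured throughput in GFLOP/s for one multiply that took `seconds`."""
    return dgemm_flop_count(m, n, k) / seconds / 1e9


def theoretical_peak_gflops(cores, ghz, flops_per_cycle):
    """Peak GFLOP/s = cores * clock rate (GHz) * operations per cycle per core."""
    return cores * ghz * flops_per_cycle


def fraction_of_peak(achieved, peak):
    """Ratio of measured throughput to theoretical peak."""
    return achieved / peak


def naive_matmul(a, b):
    """Reference triple-loop multiply on lists of lists (stand-in for DGEMM)."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for p in range(k):
            a_ip = a[i][p]
            for j in range(n):
                c[i][j] += a_ip * b[p][j]
    return c


if __name__ == "__main__":
    n = 100  # small size so the pure-Python loop finishes quickly
    a = [[1.0] * n for _ in range(n)]
    b = [[2.0] * n for _ in range(n)]

    start = time.perf_counter()
    naive_matmul(a, b)
    elapsed = time.perf_counter() - start

    measured = achieved_gflops(n, n, n, elapsed)
    # Assumed nominal hardware values, for illustration only.
    peak = theoretical_peak_gflops(cores=64, ghz=1.3, flops_per_cycle=32)
    print(f"measured: {measured:.4f} GFLOP/s")
    print(f"assumed peak: {peak:.1f} GFLOP/s")
    print(f"fraction of peak: {fraction_of_peak(measured, peak):.6f}")
```

In practice, the timed call would be an optimized library DGEMM (for example, the matrix multiply built into MATLAB or Octave), and the clock rate used for the peak would need to reflect the frequency the processor sustains under vector load.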
Pages: 6