Benchmarking Data Analysis and Machine Learning Applications on the Intel KNL Many-Core Processor

Cited by: 0

Authors:

Byun, Chansup [1]
Kepner, Jeremy [1]
Arcand, William [1]
Bestor, David [1]
Bergeron, Bill [1]
Gadepally, Vijay [1]
Houle, Michael [1]
Hubbell, Matthew [1]
Jones, Michael [1]
Klein, Anna [1]
Michaleas, Peter [1]
Milechin, Lauren [1]
Mullen, Julie [1]
Prout, Andrew [1]
Rosa, Antonio [1]
Samsi, Siddharth [1]
Yee, Charles [1]
Reuther, Albert [1]

Affiliations:

[1] MIT, Lincoln Lab, Supercomp Ctr, 244 Wood St, Lexington, MA 02173 USA

Source:

2017 IEEE HIGH PERFORMANCE EXTREME COMPUTING CONFERENCE (HPEC) | 2017

Keywords:

Benchmark; MATLAB; Octave; DGEMM; throughput; performance; machine learning; Caffe; Haswell; Knights Landing

DOI:

Not available

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

Knights Landing (KNL) is the code name for the second-generation Intel Xeon Phi product family. KNL has generated significant interest in the data analysis and machine learning communities because its new many-core architecture targets both of these workloads. The KNL many-core vector processor design enables it to exploit much higher levels of parallelism. At the Lincoln Laboratory Supercomputing Center (LLSC), the majority of users run data analysis applications such as MATLAB and Octave. More recently, machine learning applications, such as the UC Berkeley Caffe deep learning framework, have become increasingly important to LLSC users. Thus, the performance of these applications on KNL systems is of high interest to LLSC users and to the broader data analysis and machine learning communities. Our data analysis benchmarks of these applications on the Intel KNL processor indicate that single-core double-precision generalized matrix multiply (DGEMM) performance on KNL systems has improved by approximately 3.5x compared to prior Intel Xeon technologies. Our data analysis applications also achieved approximately 60% of the theoretical peak performance. In addition, a performance comparison of a machine learning application, Caffe, between two different Intel CPUs, the Xeon E5 v3 and the Xeon Phi 7210, demonstrated a 2.7x improvement on a KNL node.
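The abstract reports DGEMM results as a speedup and as a fraction of theoretical peak. The sketch below shows one common way such figures might be derived from a timed matrix multiply. It is not the authors' benchmark code: the function names, the pure-Python `naive_matmul` stand-in for an optimized DGEMM, and the peak-performance formula (cores x clock in GHz x double-precision FLOPs per cycle per core) are all assumptions made for illustration.

```python
import time


def dgemm_flop_count(m: int, n: int, k: int) -> int:
    """FLOPs for C = A @ B with A (m x k) and B (k x n): one multiply and one add per term."""
    return 2 * m * n * k


def achieved_gflops(m: int, n: int, k: int, seconds: float) -> float:
    """Achieved throughput in GFLOP/s for one multiply that took `seconds`."""
    return dgemm_flop_count(m, n, k) / seconds / 1e9


def theoretical_peak_gflops(cores: int, ghz: float, flops_per_cycle: int) -> float:
    """Assumed peak model: cores x clock (GHz) x DP FLOPs per cycle per core."""
    return cores * ghz * flops_per_cycle


def fraction_of_peak(achieved: float, peak: float) -> float:
    """Achieved throughput as a fraction of the theoretical peak."""
    return achieved / peak


def naive_matmul(a, b):
    """Hypothetical stand-in for an optimized DGEMM; plain triple loop over lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    c = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for p in range(inner):
            a_ip = a[i][p]
            for j in range(cols):
                c[i][j] += a_ip * b[p][j]
    return c


def time_matmul(n: int) -> float:
    """Time one n x n multiply and return the elapsed wall-clock seconds."""
    a = [[1.0] * n for _ in range(n)]
    b = [[2.0] * n for _ in range(n)]
    start = time.perf_counter()
    naive_matmul(a, b)
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 64  # small size so the pure-Python loop finishes quickly
    elapsed = time_matmul(n)
    rate = achieved_gflops(n, n, n, elapsed)
    # Illustrative hardware parameters only; not taken from the paper.
    peak = theoretical_peak_gflops(cores=1, ghz=2.0, flops_per_cycle=16)
    print(f"{rate:.4f} GFLOP/s, {100 * fraction_of_peak(rate, peak):.3f}% of assumed peak")
```

In a real measurement the multiply would be an optimized BLAS DGEMM call, the matrix size would be large enough to amortize startup costs, and the peak parameters would come from the processor's published specifications.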


Pages: 6

50 items in total

[21] Benchmarking Molecular Dynamics with OpenCL on Many-Core Architectures
Halver, Rene
Homberg, Wilhelm
Sutmann, Godehard
PARALLEL PROCESSING AND APPLIED MATHEMATICS (PPAM 2017), PT II, 2018, 10778 : 244 - 253
[22] Optimizing Machine Learning Algorithms on Multi-core and Many-core Architectures using Thread and Data Mapping
Serpa, Matheus S.
Krause, Arthur M.
Cruz, Eduardo H. M.
Navaux, Philippe O. A.
Pasin, Marcelo
Felber, Pascal
2018 26TH EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED, AND NETWORK-BASED PROCESSING (PDP 2018), 2018, : 329 - 333
[23] Data Criticality in Multithreaded Applications: An Insight for Many-Core Systems
Das, Abhijit
Jose, John
Mishra, Prabhat
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2021, 29 (09) : 1675 - 1679
[24] Discovery of Time Series Motifs on Intel Many-Core Systems
M. L. Zymbler
Ya. A. Kraeva
Lobachevskii Journal of Mathematics, 2019, 40 : 2124 - 2132
[25] A Study of Main-Memory Hash Joins on Many-core Processor: A Case with Intel Knights Landing Architecture
Cheng, Xuntao
He, Bingsheng
Du, Xiaoli
Lau, Chiew Tong
CIKM'17: PROCEEDINGS OF THE 2017 ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, 2017, : 657 - 666
[26] Accelerating Lattice QCD on Sunway Many-core Processor
Zhang Zengxiao
Luan Zhongzhi
Xu Chongyang
Gong Ming
Xu Shun
2018 IEEE INT CONF ON PARALLEL & DISTRIBUTED PROCESSING WITH APPLICATIONS, UBIQUITOUS COMPUTING & COMMUNICATIONS, BIG DATA & CLOUD COMPUTING, SOCIAL COMPUTING & NETWORKING, SUSTAINABLE COMPUTING & COMMUNICATIONS, 2018, : 605 - 612
[27] Parallel Image Processing on the Sunway Many-core Processor
Zhao, Meiting
Liu, Rui
Liu, Yi
Song, Kaida
Qian, Depei
PROCEEDINGS OF 2016 IEEE 18TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS; IEEE 14TH INTERNATIONAL CONFERENCE ON SMART CITY; IEEE 2ND INTERNATIONAL CONFERENCE ON DATA SCIENCE AND SYSTEMS (HPCC/SMARTCITY/DSS), 2016, : 679 - 686
[28] Accelerating Dynamic Itemset Counting on Intel Many-core Systems
Zymbler, Mikhail
2017 40TH INTERNATIONAL CONVENTION ON INFORMATION AND COMMUNICATION TECHNOLOGY, ELECTRONICS AND MICROELECTRONICS (MIPRO), 2017, : 1343 - 1348
[29] Time Series Discord Discovery on Intel Many-Core Systems
Zymbler, Mikhail
Polyakov, Andrey
Kipnis, Mikhail
PARALLEL COMPUTATIONAL TECHNOLOGIES, PCT 2019, 2019, 1063 : 168 - 182
[30] Parallel deblocking filter for HEVC on many-core processor
Yan, Chenggang
Zhang, Yongdong
Dai, Feng
Wang, Xi
Li, Liang
Dai, Qionghai
ELECTRONICS LETTERS, 2014, 50 (05) : 367 - +
