Benchmarking Data Analysis and Machine Learning Applications on the Intel KNL Many-Core Processor

Cited by: 0
Authors
Byun, Chansup [1]
Kepner, Jeremy [1]
Arcand, William [1]
Bestor, David [1]
Bergeron, Bill [1]
Gadepally, Vijay [1]
Houle, Michael [1]
Hubbell, Matthew [1]
Jones, Michael [1]
Klein, Anna [1]
Michaleas, Peter [1]
Milechin, Lauren [1]
Mullen, Julie [1]
Prout, Andrew [1]
Rosa, Antonio [1]
Samsi, Siddharth [1]
Yee, Charles [1]
Reuther, Albert [1]
Affiliation
[1] MIT, Lincoln Lab, Supercomp Ctr, 244 Wood St, Lexington, MA 02173 USA
Keywords
Benchmark; MATLAB; Octave; DGEMM; throughput; performance; machine learning; Caffe; Haswell; Knights Landing
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Knights Landing (KNL) is the code name for the second-generation Intel Xeon Phi product family. KNL has generated significant interest in the data analysis and machine learning communities because its new many-core architecture targets both of these workloads. The KNL many-core vector processor design enables it to exploit much higher levels of parallelism. At the Lincoln Laboratory Supercomputing Center (LLSC), the majority of users run data analysis applications such as MATLAB and Octave. More recently, machine learning applications, such as the UC Berkeley Caffe deep learning framework, have become increasingly important to LLSC users. Thus, the performance of these applications on KNL systems is of high interest to LLSC users and to the broader data analysis and machine learning communities. Our data analysis benchmarks of these applications on the Intel KNL processor indicate that single-core double-precision generalized matrix multiply (DGEMM) performance on KNL systems has improved by approximately 3.5x compared to prior Intel Xeon technologies. Our data analysis applications also achieved approximately 60% of the theoretical peak performance. In addition, a performance comparison of a machine learning application, Caffe, between two different Intel CPUs, the Xeon E5 v3 and the Xeon Phi 7210, demonstrated a 2.7x improvement on a KNL node.
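The "fraction of theoretical peak" figure quoted in the abstract is ordinary arithmetic over a DGEMM timing, and a minimal Python sketch can make it concrete. This is not the authors' benchmark code. The function names are hypothetical, the hardware parameters (64 cores, 1.3 GHz, 32 double-precision operations per cycle per core) are illustrative assumptions that do not come from the abstract, and the pure-Python multiply is only a stand-in for a tuned DGEMM call.

```python
import time


def dgemm_flops(n: int) -> int:
    """Operation count for multiplying two n x n matrices (n^3 multiplies + n^3 adds)."""
    return 2 * n ** 3


def achieved_gflops(n: int, seconds: float) -> float:
    """Throughput in GFLOP/s for an n x n multiply that took `seconds`."""
    return dgemm_flops(n) / seconds / 1e9


def peak_gflops(cores: int, ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak: cores x clock (GHz) x floating-point ops per cycle per core."""
    return cores * ghz * flops_per_cycle


def efficiency(achieved: float, peak: float) -> float:
    """Fraction of theoretical peak actually reached."""
    return achieved / peak


def time_naive_matmul(n: int) -> float:
    """Time a naive pure-Python n x n multiply (stand-in for a real DGEMM call)."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    start = time.perf_counter()
    b_cols = list(zip(*b))  # columns of b, so each dot product walks two flat sequences
    _ = [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 64
    elapsed = time_naive_matmul(n)
    measured = achieved_gflops(n, elapsed)
    # Assumed, illustrative node parameters; not taken from the abstract.
    peak = peak_gflops(cores=64, ghz=1.3, flops_per_cycle=32)
    print(f"measured: {measured:.4f} GFLOP/s")
    print(f"assumed peak: {peak:.1f} GFLOP/s")
    print(f"fraction of peak: {efficiency(measured, peak):.6f}")
```

Under these assumed parameters the peak works out to 2662.4 GFLOP/s, so reaching 60% of it would correspond to roughly 1597 GFLOP/s. A real measurement would time an optimized BLAS routine rather than the interpreted loop used here.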
Pages: 6