Multicore processors and GPUs: the power of parallel computing in the Cloud

Cited by: 0

Authors:

Bennett, Kelly W. [1]

Robertson, James [2]

Affiliations:

[1] US Army Res Lab, Sensors & Electron Devices Directorate, Adelphi, MD 20783 USA

[2] Clearhaven Technol LLC, Severna Pk, MD 21146 USA

Source:

ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING FOR MULTI-DOMAIN OPERATIONS APPLICATIONS II | 2020 / Vol. 11413

Keywords:

Amazon Web Services; Cloud Computing; AI/ML algorithms; Cloud-based GPU platform; Big data; Disruptive Computing Technologies; AWS SageMaker

DOI:

10.1117/12.2558600

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Sensors used in intelligence, surveillance and reconnaissance (ISR) operations and activities have the ability to generate vast amounts of data. High-volume analytical capabilities are needed to process data from multi-modal sensors to develop and test complex computational and deep learning models in support of the U.S. Army Multi-Domain Operations (MDO). The Army Research Laboratory designs, develops and tests Artificial Intelligence and Machine Learning (AI/ML) algorithms employing large repositories of in-house data. To efficiently process the data as well as design, build, train and deploy models, parallel and distributed algorithms are needed. Deep learning frameworks provide language-specific, container-based building blocks associated with deep learning neural networks applied to specific target applications. This paper discusses applications of AI/ML deep learning frameworks and Software Development Kits (SDKs) and demonstrates and compares specific multi-core processor and NVIDIA Graphics Processing Unit (GPU) implementations for desktop and Cloud environments. Frameworks selected for this research include PyTorch and MATLAB. Amazon Web Services (AWS) SageMaker was used to launch Machine Learning instances ranging from general purpose computing to GPU instances. Detailed processes, example code, performance enhancements, best practices and lessons learned are included for publicly available acoustic and image datasets. Research results indicate parallel implementations of data preprocessing steps saved significant time, but more expensive GPUs did not provide any processing time advantages for the machine learning algorithms tested.

Pages: 13
