Faster Modular Exponentiation using Double Precision Floating Point Arithmetic on the GPU

Cited by: 0

Authors:

Emmart, Niall [1]

Zheng, Fangyu [2, 3]

Weems, Charles [1]

Affiliations:

[1] Univ Massachusetts, Coll Informat & Comp Sci, Amherst, MA 01003 USA

[2] Chinese Acad Sci, State Key Lab Informat Secur, Inst Informat Engn, Beijing, Peoples R China

[3] Chinese Acad Sci, Data Assurance & Commun Secur Res Ctr, Beijing, Peoples R China

Source:

2018 IEEE 25TH SYMPOSIUM ON COMPUTER ARITHMETIC (ARITH) | 2018

Funding:

National Science Foundation (USA); National Key Research and Development Program of China

Keywords:

MULTIPLICATION;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP39 [Computer applications]

Discipline classification codes:

081203; 0835

Abstract:

This paper presents a new approach to integer multiple precision (MP) modular exponentiation, using double-precision floating point (DPF) operations, that is suitable for GPU implementation. We show speedups ranging from 20% to 34% over the best prior GPU times for sizes corresponding to common RSA cryptographic operations (2048 to 4096 bits). Three techniques are described. First, by adding 2^104 to the high half of the product, and 2^52 to the low half, we set the implicit leading 1 in the DPF mantissa so that the full 52 explicit bits are available for each half of the 104-bit products of samples. Second, the DPF values are cast bitwise to 64-bit integers for adding the column sums to get the MP result. Normally the cast would require masking off the exponents, but because they are constant, we can include them in the column sums and correct just once for their total. Third, by initializing the column sums with the appropriate negative value to compensate for the exponent sums, no corrective subtraction is needed. Our implementation on an NVIDIA GTX Titan Black GPU achieves between 132.5K and 161.9K modular exponentiations per second of size 1024 bits, with latencies ranging from 21.7 ms to 17.8 ms, making it practical for online RSA applications. Proportional results are shown for 1536 and 2048 bits. The implementation is so efficient that its maximum sustained performance is actually bounded by the thermal limit of the GPU.
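The three techniques in the abstract can be illustrated with a small CPU-side sketch. It is not the authors' GPU code: the names `split_product`, `bits`, `to_limbs`, and `mp_multiply` are hypothetical, and the biased halves (high half plus 2^104, low half plus 2^52) are built here with exact integer arithmetic, because the abstract does not spell out the floating-point instruction sequence used on the GPU. What the sketch does show is that the exponent fields of those doubles are constant, so their raw 64-bit patterns can be summed per column and the exponent total cancelled by a negative initial value.

```python
import struct

LIMB_BITS = 52
MASK52 = (1 << LIMB_BITS) - 1
MASK64 = (1 << 64) - 1
# IEEE 754 binary64 exponent fields (bias 1023) for values in
# [2^104, 2^105) and [2^52, 2^53), shifted into bit positions 52..62.
EXP_HI = (1023 + 104) << 52
EXP_LO = (1023 + 52) << 52


def bits(x: float) -> int:
    """Reinterpret a double as an unsigned 64-bit integer (bitwise cast)."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def to_limbs(value: int, n: int) -> list:
    """Split a non-negative integer into n 52-bit samples stored as doubles."""
    return [float((value >> (LIMB_BITS * i)) & MASK52) for i in range(n)]


def split_product(a: float, b: float):
    """Technique 1: return the two biased halves of a 104-bit product.

    hi = 2^104 + (high 52 bits) * 2^52 and lo = 2^52 + (low 52 bits).
    Both fit exactly in a double, with the 52 payload bits in the mantissa.
    Assumption: exact integers stand in for the GPU floating-point steps.
    """
    p = int(a) * int(b)
    hi = float((1 << 104) + ((p >> LIMB_BITS) << LIMB_BITS))
    lo = float((1 << 52) + (p & MASK52))
    return hi, lo


def mp_multiply(x: int, y: int, n: int) -> int:
    """Multiply two n-sample integers by summing raw double bit patterns."""
    xs, ys = to_limbs(x, n), to_limbs(y, n)
    cols = []
    for k in range(2 * n):
        # Technique 3: start each column at minus the exponent total it
        # will receive, so no corrective subtraction is needed afterwards.
        n_lo = sum(1 for i in range(n) if 0 <= k - i < n)
        n_hi = sum(1 for i in range(n) if 0 <= k - 1 - i < n)
        cols.append((-(n_lo * EXP_LO + n_hi * EXP_HI)) & MASK64)
    for i in range(n):
        for j in range(n):
            hi, lo = split_product(xs[i], ys[j])
            # Technique 2: add the unmasked bit patterns, with wraparound
            # emulating 64-bit integer addition.
            cols[i + j] = (cols[i + j] + bits(lo)) & MASK64
            cols[i + j + 1] = (cols[i + j + 1] + bits(hi)) & MASK64
    # Each column now holds only the sum of 52-bit payloads; resolve carries.
    return sum(c << (LIMB_BITS * k) for k, c in enumerate(cols))
```

For example, `mp_multiply(x, y, 4)` agrees with Python's built-in `x * y` for any non-negative `x` and `y` below 2^208. A full modular exponentiation would wrap a routine like this in a modular reduction and a square-and-multiply loop, neither of which is sketched here.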

Pages: 130-137

Page count: 8
