Fast AES Implementation: A High-Throughput Bitsliced Approach

Cited by: 30
Authors
Hajihassani, Omid [1, 2]
Monfared, Saleh Khalaj [3, 4]
Khasteh, Seyed Hossein [5]
Gorgin, Saeid [3, 6]
Affiliations
[1] Univ Alberta, Edmonton, AB, Canada
[2] Inst Res Fundamental Sci IPM, 1 Shahid Farbin Alley, POB 19395-5531, Tehran, Iran
[3] Inst Res Fundamental Sci IPM, POB 19395-5531, Tehran, Iran
[4] KN Toosi Univ Technol, 1 Shahid Farbin Alley,Shahid Lavasani St, Tehran, Iran
[5] KN Toosi Univ Technol, Fac Comp Engn, Shariati Ave,POB 16315-1355, Tehran, Iran
[6] IROST, Elect Engn & Informat Technol Dept, Tehran, Iran
Keywords
AES; CTR; ECB; GPU; data representation; CUDA; high-performance; block ciphers
DOI
10.1109/TPDS.2019.2911278
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
In this work, a high-throughput bitsliced AES implementation is proposed, which builds upon a new data representation scheme that exploits the parallelization capability of modern multi/many-core platforms. This representation scheme is employed as a building block to redesign all of the AES stages and tailor them for multi/many-core AES implementation. With the proposed bitsliced approach, each parallelization unit processes an unprecedented thirty-two 128-bit input blocks. Hence, a high degree of parallelization is achieved by the proposed implementation technique. Based on the characteristics of this new implementation model, the ShiftRows stage can be implicitly handled through input rearrangement and is simplified to the point where its computing process can be neglected. In this implementation, costly byte-wise operations are performed through register shifts and swaps. In addition, the need for look-up-table-based I/O operations, which are used by the Substitute Bytes stage, is eliminated by using an S-box logic circuit. The S-box logic circuit is optimized to simultaneously process 32 chunks of 128-bit input data. We develop high-throughput CTR and ECB AES encryption/decryption on 6 CUDA-enabled GPUs, which achieve 1.47 and 1.38 Tbps of encryption throughput on Tesla V100 GPU, respectively.
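The bitsliced idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' CUDA code or their exact data layout: the function names, the bit ordering, and the emulation of 32-bit registers with masked Python integers are all assumptions made for illustration. The sketch packs 32 blocks into 128 32-bit words (one word per bit position), then shows three operations that become cheap in that form: AddRoundKey as a whole-word XOR, ShiftRows as a pure reordering of words, and multiplication by x in GF(2^8) as a renaming of words plus a few XORs.

```python
LANES = 32          # blocks packed per 32-bit word, as in the abstract
BLOCK_BITS = 128    # AES block size in bits
MASK32 = 0xFFFFFFFF

def to_bitsliced(blocks):
    """Transpose 32 16-byte blocks into 128 words.

    Assumed layout: bit `lane` of slices[i] is bit i of blocks[lane],
    where bit i means bit (i % 8) of byte (i // 8).
    """
    assert len(blocks) == LANES and all(len(b) == 16 for b in blocks)
    slices = [0] * BLOCK_BITS
    for lane, block in enumerate(blocks):
        for i in range(BLOCK_BITS):
            bit = (block[i // 8] >> (i % 8)) & 1
            slices[i] |= bit << lane
    return slices

def from_bitsliced(slices):
    """Inverse of to_bitsliced: recover the 32 16-byte blocks."""
    blocks = [bytearray(16) for _ in range(LANES)]
    for i, word in enumerate(slices):
        for lane in range(LANES):
            if (word >> lane) & 1:
                blocks[lane][i // 8] |= 1 << (i % 8)
    return [bytes(b) for b in blocks]

def add_round_key_bitsliced(slices, round_key):
    """XOR one 16-byte round key into all 32 blocks at once.

    A key bit of 1 flips the whole word; a key bit of 0 leaves it unchanged.
    """
    out = []
    for i, word in enumerate(slices):
        key_bit = (round_key[i // 8] >> (i % 8)) & 1
        out.append(word ^ (MASK32 if key_bit else 0))
    return out

def shift_rows_bitsliced(slices):
    """ShiftRows as a reordering of words; no bit-level arithmetic.

    Uses the standard column-major AES state: byte index = row + 4 * col.
    """
    out = [0] * BLOCK_BITS
    for col in range(4):
        for row in range(4):
            src = row + 4 * ((col + row) % 4)
            dst = row + 4 * col
            out[8 * dst:8 * dst + 8] = slices[8 * src:8 * src + 8]
    return out

def xtime_bitsliced(b):
    """Multiply one byte position by x in GF(2^8) for all 32 lanes.

    `b` is a list of 8 words (bit 0 first). The left shift becomes a
    renaming of words; reduction by 0x1B XORs the old top bit into
    bits 0, 1, 3 and 4.
    """
    hi = b[7]
    return [hi, b[0] ^ hi, b[1], b[2] ^ hi, b[3] ^ hi, b[4], b[5], b[6]]

if __name__ == "__main__":
    # Deterministic demo data; any 32 blocks of 16 bytes would do.
    demo = [bytes((7 * lane + 13 * i) % 256 for i in range(16))
            for lane in range(LANES)]
    sliced = shift_rows_bitsliced(to_bitsliced(demo))
    print(from_bitsliced(sliced)[1].hex())
```

In a GPU setting each word would live in a thread's 32-bit register, so every XOR above acts on 32 independent blocks with one instruction. The S-box, which the abstract says is evaluated as a logic circuit, would follow the same pattern with a longer sequence of AND/XOR/NOT operations on the eight words of each byte position; it is omitted here because the abstract does not give the circuit.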
Pages: 2211-2222
Page count: 12