High-Speed FPGA Implementation of SIKE Based on an Ultra-Low-Latency Modular Multiplier

Cited by: 21
Authors
Tian, Jing [1]
Wu, Bo [1]
Wang, Zhongfeng [1]
Affiliations
[1] Nanjing Univ, Sch Elect Sci & Engn, Nanjing 210023, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Modular multiplication; supersingular isogeny key encapsulation (SIKE); elliptic curve cryptography (ECC); post-quantum cryptography (PQC); hardware implementation; FPGA; ISOGENY DIFFIE-HELLMAN;
DOI
10.1109/TCSI.2021.3094889
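Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract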
The supersingular isogeny key encapsulation (SIKE) protocol, one of the post-quantum protocol candidates, is widely regarded as the best alternative for curve-based cryptography. However, its long latency, caused by the serial large-degree isogeny computation, which is dominated by modular multiplications, has made it less competitive than the most popular post-quantum candidates. In this paper, we propose a high-speed and low-latency architecture for our recently presented optimized SIKE algorithm. Firstly, we design a new field arithmetic logic unit (FALU) with many algorithmic transformations and architectural optimizations. In particular, for the FALU, an extremely low-latency modular multiplier is devised based on a modified algorithm by fully parallelizing and highly optimizing the small-size multipliers and the reduction submodules. Secondly, we develop a compact control logic and update the instructions based on the benchmark provided in the newest SIKE library so that they fit our design well. Thirdly, an efficient memory access method is proposed that schedules the input and output of the arithmetic logic unit (ALU) across two identical RAMs, which significantly reduces the latency. Finally, we code the proposed architectures in Verilog and integrate them into the SIKE library. The implementation results on a Xilinx Virtex-7 FPGA show that for SIKEp751, our design takes only 9.3 ms at a frequency of 155.8 MHz, about 2x faster than the state of the art, and achieves the best area efficiency among existing works. Notably, the modular multiplier needs only 16 clock cycles, reducing the delay by nearly one order of magnitude at the cost of a small increase in hardware resources.
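The abstract does not spell out the modified multiplication algorithm, so the following is only a software sketch of the general idea it mentions: split the operands into small limbs, form all small partial products independently (the part a hardware design could compute in parallel), and then apply a reduction step. It assumes plain textbook Montgomery reduction in place of the paper's own reduction scheme. The 16-bit limb width and all function names are illustrative assumptions, not taken from the paper; `P751` is the public SIKEp751 prime.

```python
# Illustrative sketch only: textbook Montgomery multiplication built from
# small limb products. This is NOT the paper's modified algorithm.

# SIKEp751 prime: p = 2^372 * 3^239 - 1
P751 = (1 << 372) * 3**239 - 1


def split_limbs(x, limb_bits, n_limbs):
    """Split x into n_limbs little-endian limbs of limb_bits bits each."""
    mask = (1 << limb_bits) - 1
    return [(x >> (limb_bits * i)) & mask for i in range(n_limbs)]


def limb_product(a, b, limb_bits, n_limbs):
    """Full product a*b assembled from small limb-by-limb products.

    Every a_i * b_j is independent of the others, which is what would
    allow a hardware design to evaluate them in parallel.
    """
    a_l = split_limbs(a, limb_bits, n_limbs)
    b_l = split_limbs(b, limb_bits, n_limbs)
    total = 0
    for i in range(n_limbs):
        for j in range(n_limbs):
            total += (a_l[i] * b_l[j]) << (limb_bits * (i + j))
    return total


def radix_bits(p, limb_bits):
    """Bit length of the Montgomery radix R = 2^r_bits, a whole number of limbs."""
    n_limbs = -(-p.bit_length() // limb_bits)  # ceiling division
    return limb_bits * n_limbs


def mont_mul(a, b, p, limb_bits=16):
    """Return a * b * R^-1 mod p for odd p and 0 <= a, b < p."""
    r_bits = radix_bits(p, limb_bits)
    r_mask = (1 << r_bits) - 1
    # -p^-1 mod R (modular inverse via pow needs Python 3.8+)
    p_neg_inv = (-pow(p, -1, 1 << r_bits)) & r_mask
    t = limb_product(a, b, limb_bits, r_bits // limb_bits)
    m = ((t & r_mask) * p_neg_inv) & r_mask
    u = (t + m * p) >> r_bits  # exact division by R
    return u - p if u >= p else u


def to_mont(a, p, limb_bits=16):
    """Map a into the Montgomery domain: a * R mod p."""
    return (a << radix_bits(p, limb_bits)) % p


def from_mont(x, p, limb_bits=16):
    """Map x out of the Montgomery domain: x * R^-1 mod p."""
    return mont_mul(x, 1, p, limb_bits)
```

A product modulo the SIKEp751 prime can then be checked against Python's built-in arithmetic with `from_mont(mont_mul(to_mont(a, P751), to_mont(b, P751), P751), P751) == (a * b) % P751`. The sketch models only the arithmetic; cycle counts, pipelining, and the RAM scheduling described in the abstract are outside its scope.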
Pages: 3719-3731
Page count: 13