Shenwei-26010: A High-Performance Many-Core Processor

Authors
Hu X. [1]
Ke X. [1]
Yin F. [1]
Zhao X. [1]
Ma Y. [1]
Yan S. [1]
Ma C. [1]
Affiliation
[1] Shanghai High-Performance Integrated Circuit Design Center, Shanghai
Keywords
Computation-control core; Computing core; Energy-efficiency ratio; Low-power design; Shenwei instruction set
DOI
10.7544/issn1000-1239.2021.20201041
Abstract
Building on the multi-core processor Shenwei 1600, the high-performance many-core processor Shenwei 26010 uses SoC (system on chip) technology to integrate 4 computing-control cores and 256 computing cores on a single chip. It implements an original 64-bit RISC (reduced instruction set computer) instruction set and supports 256-bit SIMD (single instruction, multiple data) integer and floating-point vector operations. Its peak double-precision floating-point performance reaches 3.168 TFLOPS. The processor is manufactured in a 28 nm process, has a die area of more than 500 mm², and all 260 cores run stably at 1.5 GHz. Low-power design techniques at the architecture, microarchitecture, and circuit levels give it a peak energy-efficiency ratio of 10.559 GFLOPS/W. Notably, both the operating frequency and the energy-efficiency ratio are higher than those of contemporary processor products worldwide. Through technical innovations in high-frequency design, stability and reliability design, and yield design, Shenwei 26010 addresses the problems of the high frequency target, the power wall, stability and reliability, and yield that arise in the pursuit of high-performance computing. It has been deployed at large scale in the 100 PFLOPS supercomputer system "Sunway TaihuLight" and can meet the computing requirements of both scientific and engineering applications. © 2021, Science Press. All rights reserved.
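The headline figures in the abstract can be cross-checked with a short calculation. The sketch below is illustrative only and is not taken from the paper: the core counts, clock frequency, peak performance, and energy-efficiency ratio come from the abstract, while the per-core throughput values (8 double-precision FLOPs per cycle for a computing core, 16 for a computing-control core) are assumptions chosen because they reproduce the stated peak. The function names are hypothetical.

```python
# Figures stated in the abstract.
CONTROL_CORES = 4            # computing-control cores
COMPUTE_CORES = 256          # computing cores
FREQ_GHZ = 1.5               # operating frequency
STATED_PEAK_GFLOPS = 3168.0  # 3.168 TFLOPS
STATED_GFLOPS_PER_W = 10.559

# ASSUMPTIONS (not stated in the abstract): double-precision FLOPs per cycle.
# 8 = 4 doubles in a 256-bit vector x 2 for a fused multiply-add;
# 16 = two such vector pipelines in a computing-control core.
FLOPS_PER_CYCLE_COMPUTE = 8
FLOPS_PER_CYCLE_CONTROL = 16


def peak_gflops(freq_ghz, n_compute, n_control, fpc_compute, fpc_control):
    """Peak GFLOPS = frequency (GHz) x total FLOPs issued per cycle."""
    flops_per_cycle = n_compute * fpc_compute + n_control * fpc_control
    return freq_ghz * flops_per_cycle


def implied_power_watts(gflops, gflops_per_watt):
    """Power implied by a peak throughput and an energy-efficiency ratio."""
    return gflops / gflops_per_watt


peak = peak_gflops(FREQ_GHZ, COMPUTE_CORES, CONTROL_CORES,
                   FLOPS_PER_CYCLE_COMPUTE, FLOPS_PER_CYCLE_CONTROL)
power = implied_power_watts(STATED_PEAK_GFLOPS, STATED_GFLOPS_PER_W)

print(f"peak under the assumed per-core throughput: {peak} GFLOPS")
print(f"power implied by the stated ratio: {power:.1f} W")
```

Under these assumptions the computed peak matches the 3.168 TFLOPS stated in the abstract, and dividing that peak by 10.559 GFLOPS/W implies a power draw of roughly 300 W at peak. The abstract itself does not state a power figure, so the latter is a derived estimate only.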
Pages: 1155-1165
Page count: 10