Low Latency and Low Error Floating-Point Sine/Cosine Function Based TCORDIC Algorithm

Cited by: 17

Authors:

Zhu, Baozhou [1]

Lei, Yuanwu [1]

Peng, Yuanxi [1]

He, Tingting [1]

Affiliations:

[1] Natl Univ Def Technol, Sch Comp, Changsha 410073, Hunan, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS | 2017 / Vol. 64 / Issue 04

Funding:

National Natural Science Foundation of China;

Keywords:

CORDIC; floating-point sine/cosine; low latency; Taylor; RADIX-4 CORDIC ALGORITHM; ARCHITECTURE; GENERATION; PROCESSOR; HARDWARE;

DOI:

10.1109/TCSI.2016.2631588

Chinese Library Classification (CLC) codes:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

The CORDIC algorithm is well suited to implementing the sine/cosine function, but its large number of iterations leads to long delay and high overhead. Moreover, because the operand bit-width and the number of iterations are finite, the relative error of floating-point sine or cosine becomes large when the input angle is close to 0 or π/2, respectively. To overcome these shortcomings, the TCORDIC algorithm, which combines low-latency CORDIC with the Taylor algorithm, is presented. After analyzing the latency of traditional CORDIC, a low-latency CORDIC is proposed that adopts sign prediction, compressive iterations, and parallel iterations. In addition, the calculating boundary (N), which determines whether the Taylor algorithm is selected in TCORDIC, is evaluated to achieve a trade-off between area and delay. Truncated multipliers are used to reduce the area further. Finally, using the TCORDIC algorithm, pipelined and iterative structures are implemented for IEEE-754 double-precision floating-point sine/cosine with the input Z ∈ [0, π/2]. Under typical conditions (1 V, 25 °C), our designs are synthesized with a 40 nm standard cell library. The pipelined structure reaches a frequency of up to 1.70 GHz with an area of 194049.64 μm². The frequency decreases to 1.45 GHz for the iterative structure, but it requires an area of only 110590.81 μm². TCORDIC is efficient in controlling relative error and achieves accuracy within one ulp (unit in the last place) for the floating-point sine/cosine function.
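
The idea of switching between CORDIC and a Taylor series near the ends of the input range can be illustrated with a minimal software sketch. This is not the authors' hardware design: it uses conventional rotation-mode CORDIC (no sign prediction, compressive or parallel iterations), and the names `ITERATIONS`, `TAYLOR_BOUNDARY`, `cordic_sincos`, `taylor_sin` and `hybrid_sin`, as well as the boundary value 2^-8 and the four-term series, are assumptions chosen only for illustration.

```python
import math

ITERATIONS = 54               # assumption: about one iteration per significand bit
TAYLOR_BOUNDARY = 2.0 ** -8   # hypothetical stand-in for the calculating boundary N

# Elementary rotation angles atan(2^-i) and the gain correction factor K.
ANGLES = [math.atan(2.0 ** -i) for i in range(ITERATIONS)]
K = 1.0
for i in range(ITERATIONS):
    K /= math.sqrt(1.0 + 2.0 ** (-2 * i))


def cordic_sincos(z):
    """Conventional rotation-mode CORDIC for z in [0, pi/2]; returns (sin z, cos z)."""
    x, y = K, 0.0
    for i in range(ITERATIONS):
        d = 1.0 if z >= 0.0 else -1.0          # rotate toward residual angle 0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * ANGLES[i]
    return y, x


def taylor_sin(z):
    """Four-term Taylor series z - z^3/6 + z^5/120 - z^7/5040 in nested form."""
    z2 = z * z
    return z * (1.0 - z2 / 6.0 * (1.0 - z2 / 20.0 * (1.0 - z2 / 42.0)))


def hybrid_sin(z):
    """Use the Taylor series below the boundary, CORDIC otherwise."""
    if z < TAYLOR_BOUNDARY:
        return taylor_sin(z)
    return cordic_sincos(z)[0]
```

For a small angle such as `hybrid_sin(1e-10)`, the Taylor branch keeps the relative error small, whereas a fixed number of CORDIC iterations bounds only the absolute error, so its relative error can grow as the result approaches zero. A cosine near π/2 could be handled in the same spirit by evaluating the sine of the complementary angle.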

Pages: 892 - 905

Page count: 14

50 related records in total

[1] A Low Latency Floating-Point Sine and Cosine Function Hardware Implementation Algorithm
Liang F.
Liu C.
Li X.
Qiu G.
Zhang J.
Chen Z.
Li W.
Cao Q.
Lei S.
Hsi-An Chiao Tung Ta Hsueh/Journal of Xi'an Jiaotong University, 2021, 55 (11) : 106 - 114
[2] A Low Latency Floating Point CORDIC Algorithm for Sin/Cosine Function
Hou, Nanxin
Wang, Mingjiang
Zou, Xiafeng
Liu, Ming
2019 IEEE 4TH INTERNATIONAL CONFERENCE ON SIGNAL AND IMAGE PROCESSING (ICSIP 2019), 2019, : 751 - 755
[3] Simultaneous Floating-Point Sine and Cosine for VLIW Integer Processors
Jeannerod, Claude-Pierre
Jourdan-Lu, Jingyan
2012 IEEE 23RD INTERNATIONAL CONFERENCE ON APPLICATION-SPECIFIC SYSTEMS, ARCHITECTURES AND PROCESSORS (ASAP), 2012, : 69 - 76
[4] A floating-point processor for fast and accurate sine/cosine evaluation
Paliouras, V
Karagianni, K
Stouraitis, T
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2000, 47 (05) : 441 - 451
[5] Low Latency Floating-Point Division and Square Root Unit
Bruguera, Javier D.
IEEE TRANSACTIONS ON COMPUTERS, 2020, 69 (02) : 274 - 287
[6] VLSI architecture for fast and accurate floating-point sine/cosine evaluation
Paliouras, V.
Karagianni, K.
Stouraitis, T.
Proceedings of the IEEE International Conference on Electronics, Circuits, and Systems, 1998, 1 : 473 - 476
[7] FPGA Implementation of CORDIC Algorithms for Sine and Cosine Floating-Point Calculations
Sergiyenko, Anatoliy
Moroz, Leonid
Mychuda, Lesya
Samotyj, Volodymir
PROCEEDINGS OF THE 11TH IEEE INTERNATIONAL CONFERENCE ON INTELLIGENT DATA ACQUISITION AND ADVANCED COMPUTING SYSTEMS: TECHNOLOGY AND APPLICATIONS (IDAACS'2021), VOL 1, 2021, : 383 - 386
[8] Low-resource low-latency hybrid adaptive CORDIC with floating-point precision
Hong-Thu Nguyen
Xuan-Thuan Nguyen
Trong-Thuc Hoang
Duc-Hung Le
Cong-Kha Pham
IEICE ELECTRONICS EXPRESS, 2015, 12 (09):
[9] Variable-Latency Floating-Point Multipliers for Low-Power Applications
Kuang, Shiann-Rong
Wang, Jiun-Ping
Hong, Hua-Yi
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2010, 18 (10) : 1493 - 1497
[10] Low-Latency Double-Precision Floating-Point Division for FPGAs
Liebig, Bjoern
Koch, Andreas
PROCEEDINGS OF THE 2014 INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE TECHNOLOGY (FPT), 2014, : 107 - 114
