Accurate Floating-point Operation using Controlled Floating-point Precision

Cited by: 0

Authors:

Zaki, Ahmad M. [1]

Bahaa-Eldin, Ayman M. [1]

El-Shafey, Mohamed H. [1]

Aly, Gamal M. [1]

Affiliation:

[1] Ain Shams Univ, Dept Comp & Syst Engn, Cairo, Egypt

Source:

2011 IEEE PACIFIC RIM CONFERENCE ON COMMUNICATIONS, COMPUTERS AND SIGNAL PROCESSING (PACRIM) | 2011

Keywords:

dot-product; floating-point; Hilbert matrix; accurate multiplication; accurate sum; ill-conditioned matrix; machine-epsilon; relative error

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Rounding and accumulation of errors when using floating-point numbers are important factors in computer arithmetic, and many applications suffer from these problems. The underlying machine architecture and the representation of floating-point numbers play the major role in the level and value of errors in this type of calculation. A quantitative measure of a system's error level is the machine epsilon. In the current representation of floating-point numbers, the machine epsilon can be as small as 9.63E-35 in the 128-bit version of the IEEE standard floating-point representation. In this work, a novel solution that guarantees achieving the desired minimum error regardless of the machine architecture is presented. The proposed model can achieve a machine epsilon of about 4.94E-324. A new representation model is given and a complete arithmetic system with basic operations is presented. The accuracy of the proposed method is verified by inverting a high-order Hilbert matrix, an ill-conditioned matrix that cannot be solved in the traditional floating-point standard. Finally, some comparisons are given.
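The two reference points named in the abstract, machine epsilon and the ill-conditioned Hilbert matrix, can be illustrated with a minimal sketch. This is not the authors' representation model, which the record does not describe. It assumes ordinary IEEE binary64 floats as the "traditional" baseline and uses Python's exact `fractions.Fraction` arithmetic as a stand-in reference. The helper names `machine_epsilon`, `hilbert`, `invert`, and `max_rel_error` are hypothetical and chosen only for this illustration.

```python
import sys
from fractions import Fraction


def machine_epsilon():
    """Halve eps until 1 + eps/2 can no longer be told apart from 1 (binary64)."""
    eps = 1.0
    while 1.0 + eps / 2.0 != 1.0:
        eps /= 2.0
    return eps


def hilbert(n, one):
    """n x n Hilbert matrix, H[i][j] = 1 / (i + j + 1), in the type of `one`."""
    return [[one / (i + j + 1) for j in range(n)] for i in range(n)]


def invert(a, zero, one):
    """Gauss-Jordan inversion with partial pivoting (float or Fraction entries)."""
    n = len(a)
    # Augment the matrix with the identity: [A | I].
    m = [list(row) + [one if i == j else zero for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [x / p for x in m[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    # The right half now holds the inverse.
    return [row[n:] for row in m]


def max_rel_error(n):
    """Largest entrywise relative error of the float inverse vs. the exact one."""
    exact = invert(hilbert(n, Fraction(1)), Fraction(0), Fraction(1))
    approx = invert(hilbert(n, 1.0), 0.0, 1.0)
    return max(
        abs((Fraction(approx[i][j]) - exact[i][j]) / exact[i][j])
        for i in range(n)
        for j in range(n)
    )


if __name__ == "__main__":
    print("binary64 machine epsilon:", machine_epsilon())
    for n in (3, 6, 10):
        print(n, float(max_rel_error(n)))
```

Running the script prints the relative error of the binary64 inverse for a few matrix orders; the error should be expected to grow quickly with the order, which is the behaviour that motivates using the Hilbert matrix as a stress test.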

Pages: 696-701

Page count: 6
