Floating Point Numbers
Floating point numbers are a finite subset
of the real numbers, each stored with limited
precision.
Floating point numbers are used to represent real numbers, but
not all real numbers can be represented exactly as floating point numbers.
Numbers which cannot be represented exactly are approximated.
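The approximation can be observed directly. The following is a minimal sketch, assuming Python's float is an IEEE double precision number (true on common platforms, but an assumption here); the variable names are illustrative only.

```python
from decimal import Decimal
from fractions import Fraction

# 0.5 is a power of two, so it is stored exactly.
half_is_exact = Fraction(0.5) == Fraction(1, 2)

# 0.1 has no finite binary expansion, so the nearest
# representable number is stored instead.
tenth_is_exact = Fraction(0.1) == Fraction(1, 10)

# Decimal(float) converts the stored binary value exactly,
# revealing the approximation that was actually kept.
stored_tenth = Decimal(0.1)

print(half_is_exact)    # True
print(tenth_is_exact)   # False
print(stored_tenth)     # a long decimal slightly larger than 0.1
```

The stored value differs from one tenth only far to the right of the decimal point, which is why the error usually goes unnoticed until results are compared for equality.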
The set of floating point numbers (denoted F)
behaves differently from the set of real numbers (denoted R).
For example, addition in F is not always associative.
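Two differences between F and R are easy to demonstrate. This is a hedged sketch, again assuming Python floats are IEEE double precision; the names are chosen for illustration.

```python
# In R, addition is associative. In F, each intermediate sum is rounded,
# so the grouping can change the result.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c
right = a + (b + c)
is_associative = left == right

# In R, x + 1 is always greater than x. In F, a small addend can be
# lost entirely when the other operand is large enough.
big = 2.0 ** 53
is_absorbed = (big + 1.0) == big

print(left, right, is_associative)   # 0.6000000000000001 0.6 False
print(is_absorbed)                   # True
```

Neither result indicates a bug. Both follow from rounding every intermediate result to the nearest member of F.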
Floating point numbers are the binary equivalent of
scientific notation, and have internal and external
formats. The internal formats are for performing arithmetic
operations on the numbers. The external formats are for
exporting (providing) the numbers
to programs and other computers.
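The comparison with scientific notation can be made concrete: a floating point number is a significand multiplied by a power of two. The sketch below uses Python's standard math and struct modules. Treating the packed 8-byte big-endian form as an example of an "external" format is an assumption made for illustration, not something the text above specifies.

```python
import math
import struct

x = 6.5

# Decompose x into significand and exponent: x == mantissa * 2**exponent,
# with 0.5 <= abs(mantissa) < 1.
mantissa, exponent = math.frexp(x)

# Recombine the two parts to recover the original number.
rebuilt = math.ldexp(mantissa, exponent)

# Hexadecimal notation shows the binary significand and exponent directly.
hex_form = x.hex()

# A fixed byte layout that could be written to a file or sent to another
# machine (assumed here to stand in for an external format).
external_bytes = struct.pack(">d", x)

print(mantissa, exponent)     # 0.8125 3
print(hex_form)               # 0x1.a000000000000p+2
print(external_bytes.hex())   # 401a000000000000
```

Reading the bytes back with struct.unpack(">d", external_bytes) returns the same number, which is what makes a fixed layout suitable for exchange.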
Historically, floating point numbers were often represented
in the format specified by the IEEE 754 standard.
The IEEE standard not only provided a bit layout
(format) for the numbers, but also specified how arithmetic operations
should be performed on the internal representations of
those bit layouts.
Computers which processed floating point that way
were said to comply with the IEEE standard.
The internal IEEE format was traditionally used in math coprocessors.
In the future, computers will be able to use other formats internally,
because of new circuit designs and also because
not all IEEE format features are needed for
all types of numerical processing.
The IEEE external format variants have been adopted
by ANSI (the American National Standards Institute)
and will continue to be used for data exchange,
so it will be useful to know about the IEEE format.
We will discuss the IEEE format on the next page of this article.
Subsequent pages describe a higher precision floating point
format we are developing for our own use.
"When a floating point number x is represented in binary
due to finite precision in computer representation,
not all numbers, even within the range
of the computer, can be represented exactly."
    Daoqi Yang, C++ and OO Numeric Computing, p. 16

"all fractions which have a terminating
expansion in binary system will terminate in
decimal system also, but the converse is not true."
    H.M. Antia, Num. Meth. for Sci. & Eng., 2/e, p. 18

"there is an infinity
of floating point numbers that cannot be represented
on any computer."
    D.M. Capper, Intro. C++ for Scientists, 2/e, p. 325

"the set of real numbers between 0 and 1
is not countable."
    L. Wasserman, All of Statistics, p. 22
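The quotation from Antia above can be checked with exact rational arithmetic. A fraction in lowest terms has a terminating expansion in a given base exactly when every prime factor of its denominator divides that base. The sketch below applies this rule; the function names are hypothetical and chosen for illustration.

```python
from fractions import Fraction


def terminates(value, base_primes):
    """Return True if value has a finite expansion in a base whose
    prime factors are base_primes."""
    q = value.denominator  # Fraction keeps values in lowest terms
    for p in base_primes:
        while q % p == 0:
            q //= p
    # Any factor left over does not divide the base, so the expansion repeats.
    return q == 1


def terminates_in_binary(value):
    return terminates(value, (2,))


def terminates_in_decimal(value):
    return terminates(value, (2, 5))


for value in (Fraction(3, 8), Fraction(1, 10), Fraction(1, 3)):
    print(value, terminates_in_binary(value), terminates_in_decimal(value))
```

Here 3/8 terminates in both systems, 1/10 terminates only in decimal, and 1/3 terminates in neither. No fraction terminates in binary without also terminating in decimal, because 2 is a factor of 10.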