Floating point data type
The floating point data type allows you to store numbers in floating point form.
This means that the numbers behave like numbers written in scientific notation -- each value is made up of a significand (the digits), a base, and an exponent.
Floating point numbers are particularly appropriate for physics and astronomical calculations -- calculations where the result is either very small or very large.
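The scientific-notation analogy can be made concrete with a small sketch. It is written in Python rather than OmniMark, and it assumes the floats in question are ordinary binary double-precision values, which this page does not itself state. The helper name `decompose` is hypothetical.

```python
import math

def decompose(x):
    """Return (significand, exponent) such that x == significand * 2**exponent."""
    return math.frexp(x)

value = 6370.0  # that is, 6.37 * 10**3
significand, exponent = decompose(value)

# Base 10, as a person would write it in scientific notation.
decimal_form = f"{value:e}"

# Base 2, the form assumed here for the stored value.
binary_form = f"{significand} * 2**{exponent}"

print(decimal_form)
print(binary_form)
```

The same value has two readings: a decimal one for people and a binary one for the machine. Only the exponent moves when the magnitude changes, which is why very large and very small results fit in the same fixed amount of storage.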
Floating point data type formatting lets you control the number of digits in the output and pad the output with spaces on the right or on the left.
BCD numbers are generally superior to floating point numbers for most applications. There are three principal differences between these two types:
Floating point numbers are represented internally as binary (base 2) numbers. They provide precise representation of fractional numbers that are powers of 2 (1/2, 1/4, 1/8, 1/16, and so forth), but they do not provide precise representation of fractions that are powers of 10 (1/10, 1/100, 1/1000). Any fraction that can be precisely represented in base 2 can be precisely represented in base 10, but not vice versa. (There are, of course, many fractions that cannot be precisely represented in either base 2 or base 10 -- 1/3 for example.)
Floating point numbers are of a limited size and are represented by a fixed number of bytes of memory. BCD numbers, as implemented by the OmniMark BCD library, are of unlimited size.
Floating point numbers are limited in precision. As their name implies, they have a floating decimal point. That is, floating point numbers have a fixed number of significant bits, which are distributed between the whole number portion and the fractional portion of the number. The larger the whole number portion of the number, the fewer bits are available for the fractional part.
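Two of the differences listed above, binary representation and limited precision, can be observed directly. The following is a minimal sketch in Python, not OmniMark, and it assumes 64-bit binary doubles with a 53-bit significand; OmniMark's float type may differ in detail. The helper names `is_exact` and `spacing` are hypothetical.

```python
import math
from fractions import Fraction

def is_exact(text):
    """True if the decimal number written in `text` is stored exactly as a double."""
    return Fraction(float(text)) == Fraction(text)

def spacing(x):
    """Gap between neighbouring doubles near x (assumes a normal, finite double)."""
    _, exponent = math.frexp(x)
    return 2.0 ** (exponent - 53)

# Powers of 2 are exact; powers of 10 are not.
for text in ("0.5", "0.125", "0.1", "0.01"):
    print(text, is_exact(text))

# The larger the whole number part, the coarser the fractional part becomes.
for x in (1.0, 1e15, 1e16):
    print(x, spacing(x))
```

Under these assumptions the gap between adjacent values near 1e15 is already 0.125, so an amount such as 0.1 cannot be added accurately at that magnitude, and near 1e16 the gap exceeds 1.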
You can mix integer variables and floating point variables in mathematical expressions. Thus, you can write:
    import "omfloat.xmd" unprefixed

    process
       local float price initial {6.37 * float 10 ** 3}
       local float total
       local integer quantity initial {3}

       set total to quantity * price
       output "Total = " || "d" % total || "%n"

    ;Output: "Total = 19110"
Note that if you perform an operation on two integers and assign the result to a floating point number, the operation will be done as an integer operation and the result will be coerced to a float. Thus the following code will fail, even though a float can hold the result of 1000000 * 2000000:
    import "omfloat.xmd" unprefixed

    process
       local integer large initial {1000000}
       local integer larger initial {2000000}
       local float largest

       set largest to float(large * larger)
       output "Largest = " || "d" % largest || "%n"

    ;Output: "Largest = -1454759936" (This is incorrect.)
In this case, the result of the integer operation large * larger will overflow before the coercion to a floating point number. The correct way to code this operation is to force one of the operands to float before the operation is performed. This causes the operation to be performed as a floating point operation, returning a floating point value:
    import "omfloat.xmd" unprefixed

    process
       local integer large initial {1000000}
       local integer larger initial {2000000}
       local float largest

       set largest to float large * larger
       output "Largest = " || "d" % largest || "%n"

    ;Output: "Largest = 2000000000000" (This is correct.)
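The incorrect value shown earlier can be reproduced by hand, which makes the order-of-operations problem easier to see. This sketch is in Python, not OmniMark, and it assumes that OmniMark integers are signed 32-bit values that wrap on overflow. That assumption is consistent with the output shown but is not stated on this page. The helper name `wrap_int32` is hypothetical.

```python
def wrap_int32(n):
    """Reduce n to the value a signed 32-bit integer would hold after wrapping."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n

large = 1_000_000
larger = 2_000_000

# Multiply as 32-bit integers first, then convert: the overflow has already happened.
overflowed = float(wrap_int32(large * larger))

# Convert one operand first, so the multiplication itself is done in floating point.
correct = float(large) * larger

print(overflowed)
print(correct)
```

Under that assumption the first computation gives -1454759936 and the second gives 2000000000000, matching the two outputs above.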
You can use the following operators with floating point numbers:
+
-
*
/
modulo
abs
ceiling
floor
round
truncate
<
>
<=
>=
=
!=
%
In the event of an error in a calculation, the Floating Point library will return NaN. NaN means "Not a Number".
    import "omfloat.xmd" unprefixed

    process
       local float total initial {2.2}
       local stream foo initial {"foo"}

       set total to total + foo
       output "Total = " || "d" % total || "%n"

    ; Output: "Total = NaN"
    ; Note: "NaN" means "Not a Number"
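Because a NaN result is returned rather than reported, it is worth knowing how such a value behaves once it exists. The sketch below is in Python, not OmniMark, and it assumes that the NaN in question follows the usual IEEE 754 rules, which this page does not confirm for OmniMark. Note also that Python itself raises an error when a number is added to a non-numeric string, so the NaN here is created explicitly. The helper name `is_usable` is hypothetical.

```python
import math

nan = float("nan")   # IEEE 754 "Not a Number"
total = 2.2 + nan    # arithmetic involving NaN yields NaN

def is_usable(x):
    """True if x is an ordinary number rather than NaN."""
    return not math.isnan(x)

print(total)             # the NaN has propagated into the total
print(total == total)    # NaN compares unequal even to itself
print(is_usable(total))
```

Under these rules an equality comparison cannot detect NaN, so a dedicated check is the safer way to validate a result before it is used.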