IEEE 754r is an ongoing revision to the IEEE 754 floating-point standard. The intent of the revision is to extend the standard where it has become necessary, to tighten up certain areas of the original standard which were left undefined, and to merge in IEEE 854 (the radix-independent floating-point standard).
Where stricter definitions are performance-incompatible with some existing implementations, they are placed in a new section, allowing two levels of implementation.
The standard has been under revision since 2000, with a target completion date of December 2005. Participation is open to people with a solid knowledge of floating-point arithmetic. Monthly meetings are held in the San Francisco Bay area. The mailing list reflects ongoing discussions.
The most obvious enhancements to the standard are the addition of 128-bit and decimal formats and some new operations; however, there have also been significant clarifications in terminology throughout. This summary highlights the major differences in each major section of the standard. Note that the revision is not yet an approved standard, so all these changes are, in effect, proposals.
The scope has been widened to include decimal formats and arithmetic.
Many of the definitions have been rewritten for clarification and consistency. A few terms have been renamed for clarity (for example, denormalized has been renamed to subnormal).
The specification levels of a floating-point format have been enumerated, to clarify the distinction between
The sets of representable entities are then explained in detail, showing that they can be treated with the significand being considered either as a fraction or as an integer.
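The two views of the significand can be illustrated with ordinary binary doubles. The following is a minimal sketch, assuming a normal (not subnormal), nonzero, finite input and the 53-bit precision of the double format; the helper names `as_fraction_form` and `as_integer_form` are hypothetical and are not taken from the draft.

```python
import math

def as_fraction_form(x: float) -> tuple[float, int]:
    """Return (m, e) with x == m * 2**e and 1 <= |m| < 2.

    Assumes x is a normal, nonzero, finite double.
    """
    m, e = math.frexp(x)          # frexp yields 0.5 <= |m| < 1
    return m * 2, e - 1           # rescale so the significand is 1.fff...

def as_integer_form(x: float) -> tuple[int, int]:
    """Return (n, q) with x == n * 2**q and n a 53-bit integer.

    Assumes x is a normal, nonzero, finite double.
    """
    m, e = math.frexp(x)
    n = int(m * 2**53)            # exact: scaling by a power of two
    return n, e - 53

x = 0.1
m, e = as_fraction_form(x)
n, q = as_integer_form(x)
print(f"fraction view: {m!r} * 2**{e}")
print(f"integer view:  {n} * 2**{q}")
```

Both pairs describe the same value; only the position of the radix point within the significand, and hence the exponent, differs.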
The basic binary formats have the 'quad' (128-bit) format added.
Three new decimal formats are described, matching the lengths of the binary formats. These give decimal formats with 7, 16, and 34-digit significands, which may be normalized or unnormalized. For maximum range and precision, the formats merge part of the exponent and significand into a combination field, and compress the remainder of the significand using densely packed decimal encoding.
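The effect of the three significand lengths can be sketched with Python's `decimal` module. This is only an illustration of the precisions: it assumes that setting the context precision to 7, 16, or 34 digits is a fair stand-in for the three formats, and it does not model exponent ranges, the combination field, or the densely packed decimal encoding.

```python
from decimal import Context, Decimal

# Significand lengths of the three proposed decimal formats.
SIGNIFICAND_DIGITS = (7, 16, 34)

# Compute 1/3 at each precision (contexts are an assumption of this sketch).
thirds = {
    digits: Context(prec=digits).divide(Decimal(1), Decimal(3))
    for digits in SIGNIFICAND_DIGITS
}
for digits, value in thirds.items():
    print(digits, value)

# The same value can be held with different significand/exponent pairs.
short = Decimal("1.5")     # significand 15,  exponent -1
padded = Decimal("1.50")   # significand 150, exponent -2
same_value = short == padded
same_representation = short.as_tuple() == padded.as_tuple()
print(same_value, same_representation)
```

The last two lines show what an unnormalized significand permits: the two operands compare equal, yet they are stored as distinct significand and exponent pairs.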
The round-to-nearest, ties away from zero rounding mode has been added (required for decimal operations only).
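The difference between this mode and the familiar round-to-nearest, ties-to-even mode only shows up on exact ties. A small sketch using Python's `decimal` module, where `ROUND_HALF_UP` rounds ties away from zero and `ROUND_HALF_EVEN` rounds ties to even; the helper `round_to_integer` is a hypothetical name used only for this illustration.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

def round_to_integer(value: str, mode: str) -> Decimal:
    """Round a decimal string to an integer using the given rounding mode."""
    return Decimal(value).quantize(Decimal("1"), rounding=mode)

ties = ["0.5", "1.5", "2.5", "-2.5"]
away = [round_to_integer(v, ROUND_HALF_UP) for v in ties]    # ties away from zero
even = [round_to_integer(v, ROUND_HALF_EVEN) for v in ties]  # ties to even

for v, a, e in zip(ties, away, even):
    print(f"{v:>5}  away={a}  even={e}")
```

Values that are not exactly halfway between two candidates round identically under both modes.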
This section has numerous clarifications (notably in the area of comparisons), and several previously recommended operations (quiet copy, negate, abs, and copysign) are now required.
New operations include fused multiply-add (FMA), classification predicates (isnan(x), etc.), various min and max functions (which allow a total ordering), and two decimal-specific operations (samequantum and quantize).
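Several of these operations have close analogues in Python's `decimal` module, which makes it convenient for a rough illustration. The sketch below assumes that `Context.fma`, `Decimal.quantize`, `Decimal.same_quantum`, and `Decimal.is_nan` behave like the proposed operations; the 3-digit context is an artificial choice made so that the single rounding of FMA is visible.

```python
from decimal import Context, Decimal

ctx = Context(prec=3)  # tiny precision, chosen only to expose the rounding
a = Decimal("1.11")
c = Decimal("-1.23")

# Separate multiply and add: the product 1.2321 is rounded to 1.23 first.
separate = ctx.add(ctx.multiply(a, a), c)
# Fused multiply-add: a*a + c is computed exactly, then rounded once.
fused = ctx.fma(a, a, c)
print(separate, fused)

# quantize: round to the exponent (quantum) of the second operand.
price = Decimal("2.17").quantize(Decimal("0.1"))
# samequantum: compares exponents, not values.
same_q = price.same_quantum(Decimal("0.1"))
diff_q = Decimal("2.20").same_quantum(Decimal("2.2"))

# A classification predicate.
nan_flag = Decimal("NaN").is_nan()
print(price, same_q, diff_q, nan_flag)
```

With separate operations the small residual is lost when the product is rounded, whereas the fused form keeps it.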
The min and max operations are defined in such a way that they are commutative (except for the case of two NaNs as inputs). In particular:
min(+0,-0) = min(-0,+0) = -0
max(+0,-0) = max(-0,+0) = +0
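A minimal sketch of min and max functions with the signed-zero behaviour shown above, written for Python floats. The names `min754` and `max754` are hypothetical, and the handling of a single NaN operand (returning the other operand) is an assumption of this sketch, not something stated in the text above.

```python
import math

def _is_negative(x: float) -> bool:
    """True if the sign bit of x is set (distinguishes -0.0 from +0.0)."""
    return math.copysign(1.0, x) < 0

def min754(x: float, y: float) -> float:
    if math.isnan(x):          # assumption: prefer the non-NaN operand
        return y
    if math.isnan(y):
        return x
    if x == 0.0 and y == 0.0:  # zeros compare equal, so inspect the sign
        return -0.0 if _is_negative(x) or _is_negative(y) else 0.0
    return x if x < y else y

def max754(x: float, y: float) -> float:
    if math.isnan(x):          # assumption: prefer the non-NaN operand
        return y
    if math.isnan(y):
        return x
    if x == 0.0 and y == 0.0:
        return 0.0 if not (_is_negative(x) and _is_negative(y)) else -0.0
    return x if x > y else y

print(min754(0.0, -0.0), min754(-0.0, 0.0))
print(max754(0.0, -0.0), max754(-0.0, 0.0))
```

Python's built-in `min` and `max` return whichever equal operand comes first, so they are not commutative on signed zeros; the explicit sign test is what makes the result independent of argument order.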
Decimal arithmetic, compatible with that used in Java, C#, PL/I, COBOL, REXX, etc., is also defined in this section.
These sections have been revised, but with no major additions; some aspects remain under discussion.
This new section defines a second level of conformance to the standard, which specifies extensions compatible with the IEEE 754 standard but which could cause significant performance degradation for existing implementations in some circumstances.
These include:
There are several changes in the annexes; for example, the traps mechanism has been moved to an annex. Traps were not required by IEEE 754-1985, although many readers of the standard assumed they were. The revision attempts to focus more on what functionality the system should provide for dealing with exceptional cases. While traps are one way to implement these features, other approaches are available.
New Annexes are currently (2004) under discussion.