Exact floating-point subtraction theorem
In floating-point arithmetic, the Sterbenz lemma or Sterbenz's lemma[1] is a theorem giving conditions under which floating-point differences are computed exactly.
It is named after Pat H. Sterbenz, who published a variant of it in 1974.[2]
Sterbenz lemma — In a floating-point number system with subnormal numbers, if and are floating-point numbers such that
then is also a floating-point number.
Thus, a correctly rounded floating-point subtraction
is computed exactly.
The Sterbenz lemma applies to IEEE 754, the most widely used floating-point number system in computers.
Proof
Let be the radix of the floating-point system and the precision.
Consider several easy cases first:
- If is zero then , and if is zero then , so the result is trivial because floating-point negation is always exact.
- If the result is zero and thus exact.
- If then we must also have so . In this case, , so the result follows from the theorem restricted to .
- If , we can write with , so the result follows from the theorem restricted to .
For the rest of the proof, assume without loss of generality.
Write in terms of their positive integral significands and minimal exponents :
Note that and may be subnormal—we do not assume .
The subtraction gives:
Let .
Since we have:
- , so , from which we can conclude is an integer and therefore so is ; and
- , so .
Further, since , we have , so that
which implies that
Hence
so is a floating-point number.
◻
Note: Even if and are normal, i.e., , we cannot prove that and therefore cannot prove that is also normal.
For example, the difference of the two smallest positive normal floating-point numbers and is which is necessarily subnormal.
In floating-point number systems without subnormal numbers, such as CPUs in nonstandard flush-to-zero mode instead of the standard gradual underflow, the Sterbenz lemma does not apply.