Examples¶
A simple example to start from: x * (1 - x)¶
The C program¶
Let us analyze the following function.
#include <assert.h>

float f(float x)
{
    assert(0 <= x && x <= 1);
    return x * (1 - x);
}
This function computes the value x * (1 - x) for an argument x between 0 and 1. The float type is meant to force the compiler to use IEEE-754 single-precision floating-point numbers. We also assume that the default rounding mode is used: rounding to nearest, with ties broken to even.
The function returns the value \(x \otimes (1 \ominus x)\) instead of the ideal value \(x \cdot (1 - x)\), due to the limited precision of the computations. If we rule out the possibility of overflow (floating-point numbers are limited not only in precision, but also in range), the returned value is also \(\circ(x \cdot \circ(1 - x))\). The \(\circ\) function is a unary rounding operator determined by the floating-point format and the rounding mode used for the computations. This is the form Gappa works on.
First Gappa version¶
We first try to bound the result of the function. Knowing that x is in the interval [0,1], what is the enclosing interval of the function result? This can be expressed as an implication: if x is in [0,1], then the result is in… something. Since we do not want to enforce specific bounds yet, we use a question mark instead of explicit bounds.
The logical formula Gappa has to verify is enclosed between braces. The rounding operator is a unary function float<ieee_32, ne>. The result of this function is a real number that would fit in an IEEE-754 single-precision (ieee_32) floating-point number (float), if there were no overflow. This number is potentially a subnormal number, and it is obtained by rounding the argument of the rounding operator to nearest, with ties broken to even (ne).
The following Gappa script finds an interval such that the logical formula describing our previous function is true.
{ x in [0,1] -> float<ieee_32,ne>(x * float<ieee_32,ne>(1 - x)) in ? }
Gappa answers that the result is between 0 and 1. Without any help from the user, these are the best bounds Gappa is able to prove.
Results:
float<24,-149,ne>(x * float<24,-149,ne>(1 - x)) in [0, 1]
Defining notations¶
Directly writing the completely expanded logical formula is fine for small formulas, but it may become tedious once the problem gets big enough. For this reason, notations can be defined to avoid repeating the same terms over and over. These notations are all written before the logical formula.
For example, if we want not only the resulting range of the function but also the absolute error, we need to write the expression twice. Let us give it the name y.
y = float<ieee_32,ne>(x * float<ieee_32,ne>(1 - x));
{ x in [0,1] -> y in ? /\ y - x * (1 - x) in ? }
We can simplify the input a bit further by giving a name to the rounding operator too.
@rnd = float<ieee_32, ne>;
y = rnd(x * rnd(1 - x));
{ x in [0,1] -> y in ? /\ y - x * (1 - x) in ? }
These explicit rounding operators right in the middle of the expressions make it tedious to translate the initial C code. We can avoid this issue by writing the rounding operator before the equal sign.
@rnd = float<ieee_32, ne>;
y rnd= x * (1 - x);
{ x in [0,1] -> y in ? /\ y - x * (1 - x) in ? }
Please note that this implicit rounding operator only applies to the results of arithmetic operations. In particular, a rnd= b is not equivalent to a = rnd(b); it is equivalent to a = b.
Finally, we can also give a name to the infinitely precise result of the function to clearly show that both expressions have a similar arithmetic structure.
@rnd = float<ieee_32, ne>;
y rnd= x * (1 - x);
z = x * (1 - x);
{ x in [0,1] -> y in ? /\ y - z in ? }
For the script above, Gappa displays:
Results:
y in [0, 1]
y - z in [-1b-24 {-5.96046e-08, -2^(-24)}, 1b-24 {5.96046e-08, 2^(-24)}]
Gappa displays the bounds it has computed. Numbers enclosed in braces are approximations of the numbers on their left. These exact left numbers are written in decimal with a power-of-two exponent. The precise format will be described below.
Improved version¶
The bounds above are not as tight as they could be. Let us see how to expand Gappa's search space so that it finds better bounds. Not only will Gappa be able to provide a proof of the optimal bounds for the result of the function, it will also prove a tight interval on the computational absolute error.
Notations¶
x = rnd(xx); # x is a floating-point number
y rnd= x * (1 - x); # equivalent to y = rnd(x * rnd(1 - x))
z = x * (1 - x);
The syntax for notations is simple. The left-hand-side identifier is a name representing the expression on the right-hand side. Using one side or the other in the logical formula is strictly equivalent. Gappa will use the defined identifier when displaying the results and generating the proofs though, in order to improve their readability.
The second and third notations have already been presented. The first one defines x as the rounded value of a real number xx. In the previous example, we had not expressed this property of x, which is a floating-point number. This additional piece of information will help Gappa tighten the error bound. Without it, a theorem like Sterbenz's lemma cannot apply to the 1 - x subtraction.
Logical formulas and numbers¶
{ x in [0,1] -> y in [0,0.25] /\ y - z in [-3b-27,3b-27] }
Numbers and bounds can be written either in the usual scientific decimal notation or with a power-of-two exponent: 3b-27 means \(3 \cdot 2^{-27}\). Numbers can also be written in the C99 hexadecimal notation: 0x0.Cp-25 is another way to express the bound on the absolute error.
Hints¶
Although we have provided additional information through the definition of x as a rounded number, Gappa is not yet able to prove the formula. It needs another hint from the user.
z -> 0.25 - (x - 0.5) * (x - 0.5); # x * (1 - x) == 1/4 - (x - 1/2)^2
This rewriting hint indicates to Gappa that, when bounding the left-hand side, it can use an enclosure of the right-hand side. Please note that this rewriting rule only applies when Gappa tries to bound the left-hand side, not when it tries to bound a larger expression that contains the left-hand side as a sub-expression.
In some cases, Gappa might also need to perform a case split to prove the proposition. On this specific example, the user does not have to provide the corresponding hint, because the tool automatically guesses that it should split the enclosure of x into smaller sub-intervals. If Gappa had failed to do so, the following hint would have had to be added:
y, y - z $ x; # not needed, as Gappa already guessed it
It indicates that Gappa should split the enclosure of x until all the enclosures pertaining to y and y - z in the proposition have been proved.
Full listing¶
To conclude, here is the full listing of this example.
# some notations
@rnd = float<ieee_32, ne>;
x = rnd(xx); # x is a floating-point number
y rnd= x * (1 - x); # equivalent to y = rnd(x * rnd(1 - x))
z = x * (1 - x); # the value we want to approximate
# the logical proposition
{ x in [0,1] -> y in [0,0.25] /\ y - z in [-3b-27,3b-27] }
# hints
z -> 0.25 - (x - 0.5) * (x - 0.5); # x * (1 - x) == 1/4 - (x - 1/2)^2
Since Gappa succeeded in proving the whole proposition and there was no unspecified range in it, the tool does not display anything.
Tang’s exponential function¶
The algorithm¶
In Table-Driven Implementation of the Exponential Function in IEEE Floating-Point Arithmetic, Ping Tak Peter Tang described an implementation of an almost correctly-rounded elementary function in single and double precision. John Harrison later did a complete formal proof in HOL Light of the implementation in Floating point verification in HOL Light: the exponential function.
Here we just focus on the tedious computation of the rounding error. We consider neither the argument reduction nor the reconstruction part (trivial anyway, except when the end result is subnormal). We want to prove that, in the C code below, the absolute error between e and the exponential E0 of R0 (the ideal reduced argument) is less than 0.54 ulp. Variable n is an integer and all the other variables are single-precision floating-point numbers.
r2 = -n * l2;
r = r1 + r2;
q = r * r * (a1 + r * a2);
p = r1 + (r2 + q);
s = s1 + s2;
e = s1 + (s2 + s * p);
Let us denote by R the computed reduced argument and by S the stored value of the ideal constant S0. There are 32 such constants; for the sake of simplicity, we only consider the second one: \(2^{\frac{1}{32}}\). E is the value of the expression e computed with infinitely precise arithmetic. Z is the absolute error between the polynomial \(x + a_1 \cdot x^2 + a_2 \cdot x^3\) and \(\exp(x) - 1\) for \(|x| \le \frac{\log 2}{64}\).
Gappa description¶
@rnd = float<ieee_32, ne>;
a1 = 8388676b-24;
a2 = 11184876b-26;
l2 = 12566158b-48;
s1 = 8572288b-23;
s2 = 13833605b-44;
r2 rnd= -n * l2;
r rnd= r1 + r2;
q rnd= r * r * (a1 + r * a2);
p rnd= r1 + (r2 + q);
s rnd= s1 + s2;
e rnd= s1 + (s2 + s * p);
R = r1 + r2;
S = s1 + s2;
E0 = S0 * (1 + R0 + a1 * R0 * R0 + a2 * R0 * R0 * R0 + Z);
{ Z in [-55b-39,55b-39] /\ S - S0 in [-1b-41,1b-41] /\ R - R0 in [-1b-34,1b-34] /\
R in [0,0.0217] /\ n in [-10176,10176]
->
e in ? /\ e - E0 in ? }
Please note the way Z is introduced. Its definition is backward: Z is defined as the real number such that E0 is the product of S0 and the exponential of R0. This makes for clearer definitions, and it avoids having to deal with divisions.
Results:
e in [4282253b-22 {1.02097, 2^(0.0299396)}, 8768135b-23 {1.04524, 2^(0.0638374)}]
e - E0 in [-13458043620277891b-59 {-0.023346, -2^(-5.42068)}, 3364512538651833b-57 {0.023346, 2^(-5.42068)}]
Gappa is able to bound both expressions. While the bounds for e seem sensible, the bounds for e - E0 are grossly overestimated. This overestimation comes from the difference between the structures of e and E0. To improve the bounds on the error e - E0, we split it into two parts: a round-off error and a term that combines both discretization and truncation errors. The round-off error is expressed by introducing a term E with the same structure as e but without any rounding operator.
E = s1 + (s2 + S * (r1 + (r2 + R * R * (a1 + R * a2))));
The round-off error is e - E, while the other term is E - E0.
As before, the expressions E and E0 are structurally different, so Gappa grossly overestimates the bounds of E - E0. This time, we introduce a term Er with the same structure as E0 but equal to E. Since Z has no corresponding term in E, we insert an artificial term 0 in Er to obtain the same structure.
Er = S * (1 + R + a1 * R * R + a2 * R * R * R + 0);
Finally, we tell Gappa how to compute e - E0 using E and Er.
e - E0 -> (e - E) + (Er - E0);
Note that, rather than using a hint, this equality could also have been indicated as a hypothesis of the logical formula.
Full listing¶
@rnd = float< ieee_32, ne >;
a1 = 8388676b-24;
a2 = 11184876b-26;
l2 = 12566158b-48;
s1 = 8572288b-23;
s2 = 13833605b-44;
r2 rnd= -n * l2;
r rnd= r1 + r2;
q rnd= r * r * (a1 + r * a2);
p rnd= r1 + (r2 + q);
s rnd= s1 + s2;
e rnd= s1 + (s2 + s * p);
R = r1 + r2;
S = s1 + s2;
E = s1 + (s2 + S * (r1 + (r2 + R * R * (a1 + R * a2))));
Er = S * (1 + R + a1 * R * R + a2 * R * R * R + 0);
E0 = S0 * (1 + R0 + a1 * R0 * R0 + a2 * R0 * R0 * R0 + Z);
{ Z in [-55b-39,55b-39] /\ S - S0 in [-1b-41,1b-41] /\ R - R0 in [-1b-34,1b-34] /\
R in [0,0.0217] /\ n in [-10176,10176] ->
e in ? /\ e - E0 in ? }
e - E0 -> (e - E) + (Er - E0);
Gappa answers that the error is bounded by 0.535 ulp. This is consistent with the bounds computed by Tang and Harrison.
Results:
e in [8572295b-23 {1.0219, 2^(0.0312489)}, 4380173b-22 {1.04431, 2^(0.0625575)}]
e - E0 in [-75807082762648785b-80 {-6.27061e-08, -2^(-23.9268)}, 154166255364809243b-81 {6.37617e-08, 2^(-23.9027)}]
Fixed-point Newton division¶
The algorithm and its verification¶
Let us suppose we want to invert a floating-point number on a processor without a floating-point unit. The 24-bit mantissa has to be inverted from a value between 0.5 and 1 to a value between 1 and 2. For the sake of this example, the transformation is performed by Newton’s iteration with fixed-point arithmetic.
The mantissa is denoted d and its exact reciprocal is R. Newton's iteration is started with a first approximation r0 taken from a table containing reciprocals at precision \(\pm 2^{-8}\). Two iterations are then performed. The result r1 of the first iteration is computed on 16-bit words in order to speed up computations. The result r2 of the second iteration is computed on full 32-bit words. We want to prove that this second result is close enough to the infinitely precise reciprocal R = 1/d.
First, we define R as the reciprocal, and d and r0 as two fixed-point numbers that are integer multiples of \(2^{-24}\) and \(2^{-8}\) respectively. Moreover, r0 is an approximation of R, and d is between 0.5 and 1.
R = 1 / d;
{ @FIX(d,-24) /\ d in [0.5,1] /\
@FIX(r0,-8) /\ r0 - R in [-1b-8,1b-8] ->
... }
Next we have the two iterations. Gappa’s representation of fixed-point arithmetic is high-level: the tool is only interested in the weight of the least significant bit. The shifts that occur in an implementation only have an impact on the internal representation of the values, not on the values themselves.
r1 fixed<-14,dn>= r0 * (2 - fixed<-16,dn>(d) * r0);
r2 fixed<-30,dn>= r1 * (2 - d * r1);
The property we are looking for is a bound on the absolute error between r2 and R.
{ ... -> r2 - R in ? }
We expect Gappa to prove that r2 is \(R \pm 2^{-24}\). Unfortunately, this is not the case.
Results:
r2 - R in [-1320985b-18 {-5.03916, -2^(2.33318)}, 42305669b-23 {5.04323, 2^(2.33435)}]
Adding hints¶
With the previous script, Gappa computes a range for r2 - R so wide that it is useless. This is not surprising: the tool does not know what Newton's iteration is. In particular, Gappa cannot guess that such an iteration has quadratic convergence. Testing for r1 - R instead does not give any better results.
Gappa does not find any useful relation between r1 and R, as the former is a rounded multiplication while the latter is an exact division. So we split the absolute error into two terms: a round-off error we expect Gappa to compute, and the convergence term due to Newton's iteration.
{ ... ->
r1 - r0 * (2 - d * r0) in ? /\ r0 * (2 - d * r0) - R in ? }
Gappa now gives the following answer. Notice that the range of the round-off error almost matches the precision of the computations.
Results:
r1 - r0 * (2 - d * r0) in [-1b-14 {-6.10352e-05, -2^(-14)}, 788481b-32 {0.000183583, 2^(-12.4113)}]
r0 * (2 - d * r0) - R in [-131585b-16 {-2.00783, -2^(1.00564)}, 131969b-16 {2.01369, 2^(1.00984)}]
Notice that Gappa computes correct bounds for the round-off error, but not for the algorithmic one. We can help Gappa by making explicit the quadratic convergence of Newton’s iteration:
r0 * (2 - d * r0) - R -> (r0 - R) * (r0 - R) * -d;
r1 * (2 - d * r1) - R -> (r1 - R) * (r1 - R) * -d;
Gappa answers \(r_2 = R \pm 2^{-24.7}\).
Warning: the expression (d) has been assumed to be nonzero when checking a rewriting rule.
Results:
r2 - R in [-638882156545b-64 {-3.46339e-08, -2^(-24.7832)}, 32771b-44 {1.86282e-09, 2^(-28.9999)}]
While the answer is the expected one, there is a warning message about d possibly being zero. Indeed, the correctness of the rules relies on the fact that R * d = 1, which would not hold if d were zero. In order to silence this warning, we indicate at the rightmost end of the rewriting rules that they can be used only when Gappa can prove that d is not zero.
r0 * (2 - d * r0) - R -> (r0 - R) * (r0 - R) * -d { d <> 0 };
When generating a script for an external proof checker, Gappa will add this rewriting rule as a global hypothesis. For example, when selecting the Coq back-end with the option -Bcoq, the output contains the line below.
Hypothesis a1 : (_d <> 0)%R -> r9 = r2.
In this hypothesis, _d is the d variable of the example, while r9 and r2 are short notations for r0 * (2 - d * r0) - R and (r0 - R) * (r0 - R) * -d respectively. In order to access the generated proof, the user has to prove this hypothesis, which can be trivially done with Coq's field tactic.
Full listing¶
R = 1 / d;
r1 fixed<-14,dn>= r0 * (2 - fixed<-16,dn>(d) * r0);
r2 fixed<-30,dn>= r1 * (2 - d * r1);
{ @FIX(d,-24) /\ d in [0.5,1] /\
@FIX(r0,-8) /\ r0 - R in [-1b-8,1b-8] ->
r2 - R in ? }
r0 * (2 - d * r0) - R -> (r0 - R) * (r0 - R) * -d { d <> 0 };
r1 * (2 - d * r1) - R -> (r1 - R) * (r1 - R) * -d { d <> 0 };
The answer is the same as before, since Gappa easily proves that d is not zero.
Results:
r2 - R in [-638882156545b-64 {-3.46339e-08, -2^(-24.7832)}, 32771b-44 {1.86282e-09, 2^(-28.9999)}]
Another example of a Newton iteration is given in Why and Gappa.
Fast nearbyint¶
The problem at hand¶
Consider the following C code. While seemingly meaningless, this function actually computes the integer nearest to the argument x, assuming the compiler did not perform any unsavory optimization.
double f(double x) { return x + 0x3p51 - 0x3p51; }
There are some constraints on the range of x, though. Usually, one requires x to be small enough: \(|x| \le 2^{51}\). Let us prove that the function indeed behaves correctly in that case.
Gappa statement¶
As before, we first give a shorter name to the rounding operator, then we express that the input x is representable as a binary64 number, and finally we denote the computed value as y:
@rnd = float<ieee_64,ne>;
x = rnd(x_);
y rnd= (x + 3b51) - 3b51;
We can now state that, for small values of x, the result y is an integer (i.e., a multiple of \(2^0\)) and is sufficiently close to x.
{ x in [-1b51,1b51] -> @FIX(y,0) /\ |y - x| <= 0.5 }
Gappa succeeds in proving the first consequent. But the second one eludes it, as it does not know how y and x are related.
To solve the issue, we can proceed the same way as before. We just need to express the relation between x and the rounding-free version of y. This can be done as follows.
(x + 3b51) - 3b51 - x -> 0;
Another way would be as follows.
(x + 3b51) - 3b51 -> x;
In both cases, the rewriting rule is trivial, but it is outside the scope of Gappa.
Full listing¶
@rnd = float<ieee_64,ne>;
x = rnd(x_);
y rnd= (x + 3b51) - 3b51;
{ x in [-1b51,1b51] -> @FIX(y,0) /\ |y - x| <= 0.5 }
(x + 3b51) - 3b51 - x -> 0;