Examples

A simple example to start from: x * (1 - x)

The C program

Let us analyze the following function.

float f(float x)
{
  assert(0 <= x && x <= 1);
  return x * (1 - x);
}

This function computes the value x * (1 - x) for an argument x between 0 and 1. The float type is meant to force the compiler to use IEEE-754 single-precision floating-point numbers. We also assume that the default rounding mode is used: rounding to the nearest number, with ties broken to even.

The function returns the value \(x \otimes (1 \ominus x)\) instead of the ideal value \(x \cdot (1 - x)\), due to the limited precision of the computations. If we rule out the possibility of overflow (floating-point numbers are limited not only in precision, but also in range), the returned value can also be written \(\circ(x \cdot \circ(1 - x))\). This \(\circ\) function is a unary operator that depends on the floating-point format and the rounding mode used for the computations. This is the form Gappa works on.
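
For instance, the absolute error of the computed value with respect to the ideal one can be split into the rounding error of the product and the propagated rounding error of the subtraction:

\[ \circ(x \cdot \circ(1 - x)) - x \cdot (1 - x) = \bigl[\circ(x \cdot \circ(1 - x)) - x \cdot \circ(1 - x)\bigr] + x \cdot \bigl[\circ(1 - x) - (1 - x)\bigr]. \]

This decomposition is only given as an illustration here; it is roughly the kind of reasoning Gappa automates in the scripts below.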

First Gappa version

We first try to bound the result of the function. Knowing that x is in the interval [0,1], what is the enclosing interval of the function result? It can be expressed as an implication: If x is in [0,1], then the result is in … something. Since we do not want to enforce some specific bounds yet, we use a question mark instead of some explicit bounds.

The logical formula Gappa has to verify is enclosed between braces. The rounding operator is a unary function float<ieee_32, ne>. The result of this function is a real number that would fit in an IEEE-754 single-precision (ieee_32) floating-point number (float), if there were no overflow. This number is potentially a subnormal number, and it is obtained by rounding the argument of the rounding operator to nearest, with ties to even (ne).

The following Gappa script finds an interval such that the logical formula describing our previous function is true.

{ x in [0,1] -> float<ieee_32,ne>(x * float<ieee_32,ne>(1 - x)) in ? }

Gappa answers that the result is between 0 and 1. Without any help from the user, these are the best bounds Gappa is able to prove.

Results:
  float<24,-149,ne>(x * float<24,-149,ne>(1 - x)) in [0, 1]

Defining notations

Directly writing the completely expanded logical formula is fine for small formulas, but it may become tedious once the problem gets big enough. For this reason, notations can be defined to avoid repeating the same terms over and over. These notations are all written before the logical formula.

For example, if we want not only the resulting range of the function, but also the absolute error, we need to write the expression twice. Let us give it the name y.

y = float<ieee_32,ne>(x * float<ieee_32,ne>(1 - x));
{ x in [0,1] -> y in ? /\ y - x * (1 - x) in ? }

We can simplify the input a bit further by giving a name to the rounding operator too.

@rnd = float<ieee_32, ne>;
y = rnd(x * rnd(1 - x));
{ x in [0,1] -> y in ? /\ y - x * (1 - x) in ? }

These explicit rounding operators right in the middle of the expressions make it tedious to translate the initial C code. We can avoid this issue by writing the rounding operator before the equal sign.

@rnd = float<ieee_32, ne>;
y rnd= x * (1 - x);
{ x in [0,1] -> y in ? /\ y - x * (1 - x) in ? }

Please note that this implicit rounding operator applies only to the results of arithmetic operations. In particular, when b is a single term rather than an arithmetic operation, a rnd= b is not equivalent to a = rnd(b); it is equivalent to a = b.

Finally, we can also give a name to the infinitely precise result of the function to clearly show that both expressions have a similar arithmetic structure.

@rnd = float<ieee_32, ne>;
y rnd= x * (1 - x);
z = x * (1 - x);

{ x in [0,1] -> y in ? /\ y - z in ? }

For the script above, Gappa displays:

Results:
  y in [0, 1]
  y - z in [-1b-24 {-5.96046e-08, -2^(-24)}, 1b-24 {5.96046e-08, 2^(-24)}]

Gappa displays the bounds it has computed. Numbers enclosed in braces are approximations of the exact numbers on their left, which are written as a decimal integer followed by a power-of-two exponent. The precise format will be described below.

Improved version

The bounds above are not as tight as they could be. Let us see how to expand Gappa's search space in order for it to find better bounds. Not only will Gappa be able to provide a proof of the optimal bounds for the result of the function, but it will also prove a tight interval on the computational absolute error.

Notations

x = rnd(xx);                           # x is a floating-point number
y rnd= x * (1 - x);                    # equivalent to y = rnd(x * rnd(1 - x))
z = x * (1 - x);

The syntax for notations is simple. The left-hand-side identifier is a name representing the expression on the right-hand side. Using one side or the other in the logical formula is strictly equivalent. Gappa will, however, use the defined identifier when displaying results and generating proofs, in order to improve their readability.

The second and third notations have already been presented. The first one defines x as the rounded value of a real number xx. In the previous example, we had not expressed this property of x, namely that it is a floating-point number. This additional piece of information will help Gappa improve the bound on the error. Without it, a theorem like Sterbenz' lemma cannot apply to the 1 - x subtraction.

Logical formulas and numbers

{ x in [0,1] -> y in [0,0.25] /\ y - z in [-3b-27,3b-27] }

Numbers and bounds can be written either in the usual scientific decimal notation or with a power-of-two exponent: 3b-27 means \(3 \cdot 2^{-27}\). Numbers can also be written in the C99 hexadecimal notation: 0x0.Cp-25 is another way to express the bound on the absolute error.
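
Indeed, the hexadecimal digit C denotes 12, so both notations represent the same value:

\[ 0.75 \cdot 2^{-25} = \tfrac{12}{16} \cdot 2^{-25} = 3 \cdot 2^{-27}. \]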

Hints

Although we have provided additional information through the definition of x as a rounded number, Gappa is not yet able to prove the formula. It needs another hint from the user.

z -> 0.25 - (x - 0.5) * (x - 0.5);     # x * (1 - x) == 1/4 - (x - 1/2)^2

This rewriting hint indicates to Gappa that, when bounding the left-hand side, it can use an enclosure of the right-hand side. Please note that this rewriting rule only applies when Gappa tries to bound the left-hand side, not when it tries to bound a larger expression that contains the left-hand side as a sub-expression.
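
This rule is a plain algebraic identity:

\[ 0.25 - (x - 0.5) \cdot (x - 0.5) = 0.25 - x^2 + x - 0.25 = x \cdot (1 - x). \]

Since \((x - 0.5)^2\) lies in \([0, 0.25]\) when \(x \in [0,1]\), this form makes the tight enclosure \(z \in [0, 0.25]\) apparent, while a direct interval evaluation of x * (1 - x) ignores the correlation between x and 1 - x and only gives [0, 1].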

In some cases, Gappa might also need to perform a case split to prove the proposition. On this specific example, the user does not have to provide the corresponding hint because the tool automatically guesses that it should split the enclosure of x into smaller sub-intervals. If Gappa had failed to do so, the following hint would have to be added:

y, y - z $ x;    # not needed, as Gappa already guessed it

It indicates that Gappa should split the enclosure of x until all the enclosures pertaining to y and y - z in the proposition have been proved.

Full listing

To conclude, here is the full listing of this example.

# some notations
@rnd = float<ieee_32, ne>;
x = rnd(xx);                           # x is a floating-point number
y rnd= x * (1 - x);                    # equivalent to y = rnd(x * rnd(1 - x))
z = x * (1 - x);                       # the value we want to approximate

# the logical proposition
{ x in [0,1] -> y in [0,0.25] /\ y - z in [-3b-27,3b-27] }

# hints
z -> 0.25 - (x - 0.5) * (x - 0.5);     # x * (1 - x) == 1/4 - (x - 1/2)^2

Since Gappa succeeded in proving the whole proposition and there was no unspecified range in it, the tool does not display anything.

Tang’s exponential function

The algorithm

In Table-Driven Implementation of the Exponential Function in IEEE Floating-Point Arithmetic, Ping Tak Peter Tang described an implementation of an almost correctly-rounded elementary function in single and double precision. John Harrison later gave a complete formal proof of this implementation in HOL Light, in Floating point verification in HOL Light: the exponential function.

Here we just focus on the tedious computation of the rounding error. We consider neither the argument reduction nor the reconstruction part (trivial anyway, except when the end result is subnormal). We want to prove that, in the C code below, the absolute error between e and the exponential E0 of R0 (the ideal reduced argument) is less than 0.54 ulp. Variable n is an integer and all the other variables are single-precision floating-point numbers.

r2 = -n * l2;
r = r1 + r2;
q = r * r * (a1 + r * a2);
p = r1 + (r2 + q);
s = s1 + s2;
e = s1 + (s2 + s * p);

Let us denote by R the computed reduced argument and by S the stored value of the ideal constant S0. There are 32 such constants; for the sake of simplicity, we only consider the second one: \(2^{\frac{1}{32}}\). E is the value of the expression e computed with infinitely precise arithmetic. Z is the absolute error between the polynomial \(x + a_1 \cdot x^2 + a_2 \cdot x^3\) and \(\exp(x) - 1\) for \(|x| \le \frac{\log 2}{64}\).

Gappa description

@rnd = float<ieee_32, ne>;

a1 = 8388676b-24;
a2 = 11184876b-26;
l2 = 12566158b-48;
s1 = 8572288b-23;
s2 = 13833605b-44;

r2 rnd= -n * l2;
r rnd= r1 + r2;
q rnd= r * r * (a1 + r * a2);
p rnd= r1 + (r2 + q);
s rnd= s1 + s2;
e rnd= s1 + (s2 + s * p);

R = r1 + r2;
S = s1 + s2;

E0 = S0 * (1 + R0 + a1 * R0 * R0 + a2 * R0 * R0 * R0 + Z);

{ Z in [-55b-39,55b-39] /\ S - S0 in [-1b-41,1b-41] /\ R - R0 in [-1b-34,1b-34] /\
  R in [0,0.0217] /\ n in [-10176,10176]
   ->
  e in ? /\ e - E0 in ? }

Please note the way Z is introduced. Its definition is backward: Z is a real number such that E0 is the product of S0 and the exponential of R0. It makes for clearer definitions and it avoids having to deal with divisions.
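
In other words, Z stands for \(\exp(R_0) - 1 - (R_0 + a_1 R_0^2 + a_2 R_0^3)\), so the definition unfolds to

\[ E_0 = S_0 \cdot (1 + R_0 + a_1 \cdot R_0^2 + a_2 \cdot R_0^3 + Z) = S_0 \cdot \exp(R_0). \]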

Results:
  e in [4282253b-22 {1.02097, 2^(0.0299396)}, 8768135b-23 {1.04524, 2^(0.0638374)}]
  e - E0 in [-13458043620277891b-59 {-0.023346, -2^(-5.42068)}, 3364512538651833b-57 {0.023346, 2^(-5.42068)}]

Gappa is able to bound both expressions. While the bounds for e seem sensible, the bounds for e - E0 are grossly overestimated. This overestimation comes from the difference between the structures of e and E0. To improve the bounds on the error e - E0, we split it into two parts: a round-off error and a term that combines both discretization and truncation errors. The round-off error is expressed by introducing a term E with the same structure as e but without any rounding operator.

E = s1 + (s2 + S * (r1 + (r2 + R * R * (a1 + R * a2))));

The round-off error is e - E, while the other term is E - E0. As before, the expressions E and E0 are structurally different, so Gappa grossly overestimates the bounds of E - E0. This time, we introduce a term Er with the same structure as E0 but equal to E. Since Z has no corresponding term in E, we insert an artificial term 0 in Er to obtain the same structure.

Er = S * (1 + R + a1 * R * R + a2 * R * R * R + 0);

Finally, we tell Gappa how to compute e - E0 using E and Er.

e - E0 -> (e - E) + (Er - E0);

Note that, rather than using a hint, this equality could also have been indicated as a hypothesis of the logical formula.
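
This rewriting rule is an exact identity. Since r1 + r2 = R and s1 + s2 = S, the term E introduced above satisfies

\[ E = s_1 + (s_2 + S \cdot (R + a_1 \cdot R^2 + a_2 \cdot R^3)) = S \cdot (1 + R + a_1 \cdot R^2 + a_2 \cdot R^3) = E_r, \]

so the right-hand side of the rule is \((e - E) + (E - E_0) = e - E_0\).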

Full listing

@rnd = float< ieee_32, ne >;

a1 = 8388676b-24;
a2 = 11184876b-26;
l2 = 12566158b-48;
s1 = 8572288b-23;
s2 = 13833605b-44;

r2 rnd= -n * l2;
r rnd= r1 + r2;
q rnd= r * r * (a1 + r * a2);
p rnd= r1 + (r2 + q);
s rnd= s1 + s2;
e rnd= s1 + (s2 + s * p);

R = r1 + r2;
S = s1 + s2;

E = s1 + (s2 + S * (r1 + (r2 + R * R * (a1 + R * a2))));
Er = S * (1 + R + a1 * R * R + a2 * R * R * R + 0);
E0 = S0 * (1 + R0 + a1 * R0 * R0 + a2 * R0 * R0 * R0 + Z);

{ Z in [-55b-39,55b-39] /\ S - S0 in [-1b-41,1b-41] /\ R - R0 in [-1b-34,1b-34] /\
  R in [0,0.0217] /\ n in [-10176,10176] ->
  e in ? /\ e - E0 in ? }

e - E0 -> (e - E) + (Er - E0);

Gappa answers that the error is bounded by 0.535 ulp. This is consistent with the bounds computed by Tang and Harrison.

Results:
  e in [8572295b-23 {1.0219, 2^(0.0312489)}, 4380173b-22 {1.04431, 2^(0.0625575)}]
  e - E0 in [-75807082762648785b-80 {-6.27061e-08, -2^(-23.9268)}, 154166255364809243b-81 {6.37617e-08, 2^(-23.9027)}]

Fixed-point Newton division

The algorithm and its verification

Let us suppose we want to invert a floating-point number on a processor without a floating-point unit. The 24-bit mantissa, a value between 0.5 and 1, has to be inverted to obtain a value between 1 and 2. For the sake of this example, the transformation is performed by Newton's iteration with fixed-point arithmetic.

The mantissa is denoted d and its exact reciprocal R. Newton's iteration is started with a first approximation r0 taken from a table containing reciprocals at precision \(\pm 2^{-8}\). Two iterations are then performed. The result r1 of the first iteration is computed on 16-bit words in order to speed up computations. The result r2 of the second iteration is computed on full 32-bit words. We want to prove that this second result is close enough to the infinitely precise reciprocal R = 1/d.
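
For the reciprocal of d, each Newton step maps an approximation r of R to the refined approximation

\[ r \cdot (2 - d \cdot r), \]

and the definitions of r1 and r2 below apply this step, each time rounded to a fixed-point format.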

First, we define R as the reciprocal of d, and we state that d and r0 are two fixed-point numbers, integer multiples of \(2^{-24}\) and \(2^{-8}\) respectively. Moreover, r0 is an approximation of R, and d is between 0.5 and 1.

R = 1 / d;

{ @FIX(d,-24) /\ d in [0.5,1] /\
  @FIX(r0,-8) /\ r0 - R in [-1b-8,1b-8] ->
  ... }
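
Here, @FIX expresses membership in a fixed-point format:

\[ \mathtt{FIX}(x, e) \quad\Longleftrightarrow\quad \exists m \in \mathbb{Z},\ x = m \cdot 2^{e}. \]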

Next we have the two iterations. Gappa’s representation of fixed-point arithmetic is high-level: the tool is only interested in the weight of the least significant bit. The shifts that occur in an implementation only have an impact on the internal representation of the values, not on the values themselves.

r1 fixed<-14,dn>= r0 * (2 - fixed<-16,dn>(d) * r0);
r2 fixed<-30,dn>= r1 * (2 - d * r1);
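
For concreteness, here is one way these two iterations could be implemented in C on a 32-bit processor, using 64-bit intermediate products. This sketch is not part of the original algorithm description: the function name, the argument formats (d as Q24, r0 as Q8, the result as Q30), and the shift amounts are illustrative; as said above, Gappa only cares about the values, not their representation. The truncating right shifts on nonnegative values implement the dn (round toward minus infinity) roundings.

#include <stdint.h>

uint32_t recip_newton(uint32_t d_q24, uint32_t r0_q8)
{
    uint32_t d_q16  = d_q24 >> 8;                                    /* fixed<-16,dn>(d) */
    uint32_t t1_q24 = (2u << 24) - d_q16 * r0_q8;                    /* 2 - d * r0 */
    uint32_t r1_q14 = (uint32_t)(((uint64_t)r0_q8 * t1_q24) >> 18);  /* fixed<-14,dn> */
    uint64_t t2_q38 = (2ull << 38) - (uint64_t)d_q24 * r1_q14;       /* 2 - d * r1 */
    uint32_t r2_q30 = (uint32_t)(((uint64_t)r1_q14 * t2_q38) >> 22); /* fixed<-30,dn> */
    return r2_q30;
}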

The property we are looking for is a bound on the absolute error between r2 and R.

{ ... -> r2 - R in ? }

We expect Gappa to prove that r2 is \(R \pm 2^{-24}\). Unfortunately, this is not the case.

Results:
  r2 - R in [-1320985b-18 {-5.03916, -2^(2.33318)}, 42305669b-23 {5.04323, 2^(2.33435)}]

Adding hints

With the previous script, Gappa computes a range for r2 - R so wide that it is useless. This is not surprising: the tool does not know what Newton's iteration is. In particular, Gappa cannot guess that such an iteration has a quadratic convergence. Testing for r1 - R instead does not give any better results.

Gappa does not find any useful relation between r1 and R, as the first one is a rounded multiplication while the second one is an exact division. So we split the absolute error into two terms: a round-off error we expect Gappa to compute, and an algorithmic error coming from Newton's iteration.

{ ... ->
  r1 - r0 * (2 - d * r0) in ? /\ r0 * (2 - d * r0) - R in ? }

Gappa now gives the following answer. Notice that the range of the round-off error almost matches the precision of the computations.

Results:
  r1 - r0 * (2 - d * r0) in [-1b-14 {-6.10352e-05, -2^(-14)}, 788481b-32 {0.000183583, 2^(-12.4113)}]
  r0 * (2 - d * r0) - R in [-131585b-16 {-2.00783, -2^(1.00564)}, 131969b-16 {2.01369, 2^(1.00984)}]

So Gappa computes a useful bound for the round-off error, but not for the algorithmic one. We can help it by making explicit the quadratic convergence of Newton's iteration:

r0 * (2 - d * r0) - R -> (r0 - R) * (r0 - R) * -d;
r1 * (2 - d * r1) - R -> (r1 - R) * (r1 - R) * -d;
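
Both rules are instances of the same algebraic identity: for d not equal to zero and R = 1/d,

\[ r \cdot (2 - d \cdot r) - R = -\frac{(d \cdot r - 1)^2}{d} = (r - R) \cdot (r - R) \cdot (-d). \]

The error of each iteration is thus proportional to the square of the error of the previous one, which is the quadratic convergence mentioned above.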

Gappa answers \(r_2 = R \pm 2^{-24.7}\).

Warning: the expression (d) has been assumed to be nonzero when checking a rewriting rule.
Results:
  r2 - R in [-638882156545b-64 {-3.46339e-08, -2^(-24.7832)}, 32771b-44 {1.86282e-09, 2^(-28.9999)}]

While the answer is the expected one, there is a warning message about d possibly being zero. Indeed, the correctness of the rules relies on the fact that R * d = 1, which would not hold if d were zero. In order to silence this warning, we indicate at the rightmost end of the rewriting rules that they can be used only when Gappa can prove that d is not zero.

r0 * (2 - d * r0) - R -> (r0 - R) * (r0 - R) * -d   { d <> 0 };

When generating a script for an external proof checker, Gappa will add this rewriting rule as a global hypothesis. For example, when selecting the Coq back-end with the option -Bcoq, the output contains the line below.

Hypothesis a1 : (_d <> 0)%R -> r9 = r2.

In this hypothesis, _d is the d variable of the example, while r9 and r2 are short notations for r0 * (2 - d * r0) - R and (r0 - R) * (r0 - R) * -d respectively. In order to access the generated proof, the user has to prove this hypothesis, which can be trivially done with Coq’s field tactic.

Full listing

R = 1 / d;

r1 fixed<-14,dn>= r0 * (2 - fixed<-16,dn>(d) * r0);
r2 fixed<-30,dn>= r1 * (2 - d * r1);

{ @FIX(d,-24) /\ d in [0.5,1] /\
  @FIX(r0,-8) /\ r0 - R in [-1b-8,1b-8] ->
  r2 - R in ? }

r0 * (2 - d * r0) - R -> (r0 - R) * (r0 - R) * -d   { d <> 0 };
r1 * (2 - d * r1) - R -> (r1 - R) * (r1 - R) * -d   { d <> 0 };

The answer is the same as before, since Gappa easily proves that d is not zero.

Results:
  r2 - R in [-638882156545b-64 {-3.46339e-08, -2^(-24.7832)}, 32771b-44 {1.86282e-09, 2^(-28.9999)}]

Another example of a Newton iteration is given in Why and Gappa.

Fast nearbyint

The problem at hand

Consider the following C code. While seemingly meaningless, this function actually computes the integer nearest to the argument x, assuming the compiler did not perform any unsavory optimization.

double f(double x) { return x + 0x3p51 - 0x3p51; }

There are some constraints on the range of x, though. Usually, one requires x to be small enough: \(|x| \le 2^{51}\). For such an x, the sum \(x + 3 \cdot 2^{51}\) lies between \(2^{52}\) and \(2^{53}\), a range where consecutive binary64 numbers are exactly one apart; the addition therefore rounds x to the nearest integer, and the final subtraction is exact. Let us prove that the function indeed behaves correctly in that case.
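
Before turning to Gappa, a quick sanity check can be performed by comparing f with the C99 nearbyint function, which also rounds to an integer according to the ambient rounding direction. This small program is only an illustration and assumes strict IEEE-754 binary64 arithmetic, the default rounding mode, and no value-changing optimizations:

#include <math.h>
#include <stdio.h>

double f(double x) { return x + 0x3p51 - 0x3p51; }

int main(void)
{
  /* a few sample inputs, including halfway cases that round to even */
  double samples[] = { 0.3, -1.7, 2.5, 3.5 };
  for (int i = 0; i < 4; ++i)
    printf("f(%g) = %g, nearbyint(%g) = %g\n",
           samples[i], f(samples[i]), samples[i], nearbyint(samples[i]));
  return 0;
}

Both calls should agree on every sample, including the halfway cases, since nearbyint follows the ambient rounding direction, which defaults to nearest, ties to even.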

Gappa statement

As before, we first give a shorter name to the rounding operator, then we express that the input x is representable as a binary64 number, and finally we denote the computed value as y:

@rnd = float<ieee_64,ne>;
x = rnd(x_);
y rnd= (x + 3b51) - 3b51;

We can now state that, for small values of x, the result y is an integer (i.e., a multiple of \(2^0\)) and it is sufficiently close to x.

{ x in [-1b51,1b51] -> @FIX(y,0) /\ |y - x| <= 0.5 }

Gappa succeeds in proving the first consequent. But the second one eludes it, as it does not know how y and x are related.

To solve the issue, we can proceed the same way as before. We just need to express the relation between x and the rounding-free version of y. This can be done as follows.

(x + 3b51) - 3b51 - x -> 0;

Another way would be as follows.

(x + 3b51) - 3b51 -> x;

In both cases, the rewriting rule is trivial, but it is outside the scope of Gappa.

Full listing

@rnd = float<ieee_64,ne>;

x = rnd(x_);
y rnd= (x + 3b51) - 3b51;

{ x in [-1b51,1b51] -> @FIX(y,0) /\ |y - x| <= 0.5 }

(x + 3b51) - 3b51 - x -> 0;