## Verifying pointer arithmetic

Today I’ll look at whether code that uses pointer arithmetic is any harder to verify than equivalent code that does not use pointer arithmetic.

Consider this function for copying an array (or part of an array) into another array (or part of another array):

voidarrayCopy(const int* src,int* dst, size_t num) { size_t i;for(i = 0; i < num; ++i) { dst[i] = src[i]; } }

## Aliasing and how to control it

Today I’ll start by writing a simple function that determines the maximum and minimum of two integers. We want to return two values, and C doesn’t make that easy unless we declare a **struct **to hold them. So I’ll pass two pointers to where I want the results stored instead. Here goes:

#include"arc.h"voidminMax(inta,intb,out int*min,out int*max)post(*min <= a; *min <= b; *min == a || *min == b)post(*max >= a; *max >= b; *max == a || *max == b) { *min = a < b ? a : b; *max = a > b ? a : b; }

I’ve highlighted the ArC annotations in green. Read more…

## Specification with Ghost Functions

In my previous post I showed that the C expression sublanguage extended with quantified expressions (**forall **and **exists**) is insufficient to allow some specifications to be expressed. I presented this function (annotated with an incomplete specification) to average an array of data:

int16_t average(constint16_t *arrayreadings, size_t numReadings)pre(readings.lwb == 0; readings.lim == numReadings)pre(numReadings != 0)post(result== ?/* sum of elements of readings *//numReadings) {intsum = 0; size_t i;for(i = 0; i < numReadings; ++i)keep(i <= numReadings)keep(sum == ?/* sum of first i elements of readings */)decrease(numReadings - i) { sum += readings[i]; }return(int16_t)(sum/numReadings); }

## Expressing the Inexpressible

When writing preconditions, postconditions and other specifications for C programs, sometimes we need to write expressions that can’t be expressed in plain C. That’s why formal verification systems based on annotated programming languages almost always augment the expression sublanguage with **forall **and **exists **expressions. In previous posts, I’ve introduced ArC’s implementations of these. For example, the following expression yields **true **if all elements of the array *arr *are between 0 and 100 inclusive:

forallindinarr.indices :- arr[ind] >= 0 && arr[ind] <= 100

Here, *ind *is declared as a bound variable that ranges over the values in the expression that follows the keyword **in**, which in this case is all the indices into *arr*. Read more…

## Using constrained types in C

When writing critical software, one of the advantages cited for using Ada rather than C is that Ada lets you define constrained types, like this:

typePercentageisIntegerrange0 .. 100;

## Reasoning about null-terminated strings in C/C++

In my last post I described how ArC supports reasoning about array access, by allowing you to refer to the bounds of an array in a specification. If the code itself needs to know the size of an array, then the size is provided by other means, for example by passing it as an extra parameter. However, when using arrays of characters, standard practice in C is not to pass the number of elements, but to use a null character to indicate the end. Read more…

## Invariants for C/C++ Classes and Structs

In yesterday’s post, I proposed the use of simple C++ classes in critical software. I pointed out that classes are better than C structs, because they offer encapsulation and make it easier to avoid using objects that are not completely initialized. Now I’m going to point out another advantage of classes over structs, which is that they make it easier to enforce invariants.

Consider the following C code:

**typedef struct **_Limits {

**int **minValue;

**int **maxValue; * // must always be >= minValue*

} Limits;

The comment is an example of an *invariant*, i.e. a condition on the values of the members that we always expect to be true. During testing, we might want to do runtime checks to report any violation of the invariant. We would also like to do static analysis to make sure it always holds.

The problem with enforcing this invariant is that *minValue *and *maxValue *are public. This means that any piece if code that uses a variable of type *Limits *can break the invariant by assigning a new valie to *minValue *or *maxValue*. If we want to check the invariant at runtime, we must add a runtime check everywhere that the code assigns a value to either of these fields. Likewise, a static analyser must consider whether the invariant is broken at every place where one of these fields is assigned.

Let’s look at how we would define the *Limits *type using a C++ class instead:

**class **Limits {

**int **_minValue;

**int **_maxValue; // must always be >= minValue

**public**:

**int **minValue() **const **{ **return **_minValue; }

**int **maxValue() **const **{ **return **_maxValue; }

Limits(**int **n, **int **x)

: _minValue(n), _maxValue(x) {}

}

I’ve made the data private, and I’ve added a couple of functions to allow the min and max values to be read, but not written (don’t worry about whether this is efficient – any reasonable C++ compiler will inline calls to these functions). I’ve also added a constructor so that we can create values of type *Limits*. Using this new declaration of *Limits*, the only way that anyone can break the invariant is by calling the constructor with *n > x*. So there is just *one *place where we need to insert a runtime check to catch every instance where this invariant might be broken.

Finally, let’s look at what you need to do to get ArC to verify statically that the invariant always holds:

`#`

**include **"arc.h"

**class **Limits {

**int **_minValue;

**int **_maxValue;

**invariant**(_maxValue >= _minValue)

**public**:

**int **minValue() **const **{ **return **_minValue; }

**int **maxValue() **const **{ **return **_maxValue; }

Limits(**int **n, **int **x)

: _minValue(n), _maxValue(x) **pre**(x >= n) {}

}

Instead of expressing the invariant as a comment, we have expressed it using the **invariant **keyword. We #include “arc.h” at the start so that when you are compiling the file using a normal C++ compiler, **invariant**(…) is defined as a macro that expands to nothing. This makes the invariant invisible to the compiler. But when ArC sees the invariant, it know that it needs to prove that the invariant holds anywhere that we create or modify a value of type *Limits*.

Since the invariant only depends on private data, ArC only has to worry about breaking the invariant within the class’s own constructors and members. In order to prove that the *Limits* constructor satisfies the invariant, we need to ensure *x >= n* whenever it is called. That’s why I added the **pre**(x >= n) clause in the constructor. This clause tells ArC to assume *x >= n* when it verifies the constructor, and to verify x >= n whenever we call the constructor. **pre **is another ArC keyword – it stands for *precondition*.

Incidentally, although Microsoft’s Vcc doesn’t support any C++ (unlike ArC), it does allow you to declare invariants on structures. But when you want to initialize or modify such a structure, you’ll generally need to add some more annotations to “unwrap” and “wrap” it. That’s the price of not having encapsulation.