Stupid Z3Py Tricks Strikes Back: Verifying a Keras Neural Network

Neural networks are all the rage these days. One mundane way of looking at neural networks is that they are a particular class of parametrized functions . What makes them useful is:

  1. They can be used at insane scale due to their simplicity and excellent available implementations and tooling
  2. There are intuitive ways to input abstract structure and symmetry expected of a problem, for example translation symmetry, or a hierarchy of small scale pattern recognition combining into large scale structures. How this all works is very peculiar.
  3. Inspirational analogies can be drawn from nature.

Neural networks made out of just relus (rectified linear units, relu(x) = max(0,x) ) and linear layers are particularly amenable to formal analysis. Regarding the weights as fixed (once a network has be trained), the complete neural network is a piecewise linear function. The regions where it is piecewise define are polyhedra (They are defined by the hyperplanes of the relu transitions as inequality constraints). Such functions are among those the most amenable to automated rigorous analysis.

A relu

Most machine learning tasks don’t have a mathematically precise specification. What is the mathematically precise definition of a picture of a horse? We could try to come up with something (this is sort of what good old fashioned AI tried to do), but in my opinion it would be rather suspect.

Tasks that do have a precise spec are questionable areas for machine learning techniques, because how can you know that the network meets the spec? Also, one would suspect such a problem would have structure that you might be better off with a more traditional algorithmic approach.

However, there a a couple areas where one does have reasonable formal questions one might want to ask about a neural network:

  • Robustness around training and validation data. Finding Adversarial examples or proving there are none.
  • Games like Go. Alpha Go is a marriage of more traditional algorithmic approaches and machine learning. There is a core of traditional game tree search to it.
  • Control Systems – There are many control systems which we do have a reasonable formal spec one could write, such as walking robots. These systems are so high dimensional that it is difficult to derive a good controller from the spec, and hence reinforcement learning may be a reasonable option. But we would like to confirm the controller is good and not dangerous
  • Neural networks as computational accelerators. There are problems which we know how to solve, but are too slow. Neural networks can be evaluated very quickly and easily thanks to modern frameworks. It may be useful to presolve a large number of examples offline using the slow algorithm and train a neural network to give good estimates. We may be able to replace the slow thing entirely if we can verify the neural network always is good enough.

We’re going to use a neural network to fit a chebyshev polynomial. Here we’re picking a Chebyshev polynomial as our base truth because Chebyshev polynomials have some pleasant waviness to them. Why not. I like ’em. Also polynomials are easily understood by z3 as a base spec.

This example of course is a complete toy. How often do you see 1-D input space neural networks? Not often I bet.
But it’s nice for a couple reasons:

  1. Because we can visualize it.
  2. It drives home the point about neural nets being a space of piecewise linear function approximators, and how similar training is to curve fitting.
  3. It’s simple enough that z3 can crush it. There is a big question if z3 or other methods can scale to realistic neural nets. Modern top of the line neural nets are insanely huge. As we’ve done it here, I highly doubt it. There are special purpose SMT solvers being built for this task. Also the slightly different technology of mixed integer programming can be used and seems very promising. So this is an area of research. See links at the bottom for more.

Generally speaking, the combination of the capabilities of sympy and z3 give us access to some very intriguing possibilities. I’m not going to investigate this in detail in this post, but I are showing how you can convert a sympy derived polynomial into a python expression using lambdify, which can then be in turn used on z3 variables.

Our actual chebyshev polynomial

Here we build a very small 3 layers network using Keras. We use a least squares error and an adam optimizer. Whatever. I actually had difficulty getting nice results out for higher order chebyshev polynomials. Neural networks are so fiddly.

Our neural fit of the polynomial

And here we extract the weights and reinterpret them into z3. We could also have used z3 floating point capabilities rather than reals if you’re concerned about numerical issues. It was convenient to have my layers be different sizes, so that size mismatch would throw a python error. That’s how I found out the weights are transposed by default. The code at the end extracts a found countermodel and evaluates it. If you want to feel fancier, you can also use the prove function rather than an explicit Solver() instance. Saying you proved the neural network matches the polynomial to a certain tolerance feels really good. If you look at the graphs, the edges at 1 and -1 actually have pretty bad absolute error, around 0.5.

Links – Evaluating Robustness of Neural Networks with Mixed Integer Programming – reluplex, SMT specifically for neural networks

More Stupid Z3Py Tricks: Simple Proofs

Z3 can be used for proofs. The input language isn’t anywhere near as powerful as interactive theorem provers like Coq, Isabelle, or Agda, but you can ask Z3 to prove pretty interesting things. Although the theorems that follow aren’t hard in interactive theorem provers, they would take beyond complete novice level skills to state or prove.

I like to think of the z3 proving process as “failing to find a counterexample”. Z3py has supplies a function prove which is implemented like this.

Basically, it negates the thing you want to prove. It then tries to find a way to instantiate the variables in the expression to make the statement false. If it comes back unsat, then there is no variable assignment that does it. Another way to think about this is rewriting the \forall y. p(y) as \neg \exists y \neg p (y). The first \neg lives at sort of a meta level, where we consider unsat as a success, but the inner \neg is the one appearing in s.add(Not(claim)).

We can prove some simple facts. This is still quite cool, let’s not get too jaded. Manually proving these things in Coq does suck (although is easy if you use the ring, psatz, and lra tactics, which you DEFINITELY should. It is a great irony of learning coq that you cut your teeth on theorems that you shouldn’t do by hand).

Ok, here’s our first sort of interesting example. Some properties of even and odd numbers. Even and Odd are natural predicates. What are possible choices to represent predictaes in z3?
We can either choose python functions IntSort -> BoolSort() as predicates or we can make internal z3 functions Function(IntSort(), BoolSort())

All well and good, but try to prove facts about the multiplicative properties of even and odd. Doesn’t go through. 🙁

Here’s a simple inductive proof. Z3 can do induction, but you sort of have to do it manually, or with a combinator. Given a predicate f, inductionNat returns

Here’s another cute and stupid trick. Z3 doesn’t have a built in sine or cosine. Perhaps you would want to look into dreal if you think you might be heavily looking into such things. However, sine and cosine are actually defined implicitly via a couple of their formula. So we can instantiate
A slightly counterintuitive thing is that we can’t use this to directly compute sine and cosine values. That would require returning a model, which would include a model of sine and cosine, which z3 cannot express.
However, we can try to assert false facts about sine and cosine and z3 can prove they are in fact unsatisfiable. In this way we can narrow down values by bisection guessing. This is very silly.

A trick that I like to use sometimes is embedding objects in numpy arrays. Numpy slicing is the best thing since sliced bread. A lot, but not all, of numpy operations come for free, like matrix multiply, dot, sum, indexing, slicing, reshaping. Only some are implemented in terms of overloadable operations. here we can prove the Cauchy Schwartz inequality for a particular vector and some axioms of vector spaces.

Defining and proving simple properties of Min and Max functions

Proving the Babylonian method for calculating square roots is getting close to the right answer. I like the to think of the Babylonian method very roughly this way: If your current guess is low for the square root x/guess is high. If your guess is high, x/guess is low. So if you take the average of the two, it seems plausible you’re closer to the real answer. We can also see that if you are precisely at the square root, (x/res + x)/2 stays the same. Part of the the trick here is that z3 can understand square roots directly as a specification. Also note because of python overloading, babylonian with work on regular numbers and symbolic z3 numbers. We can also prove that babylon_iter is a contractive, which is interesting in it’s own right.

A funny thing we can do is define interval arithmetic using z3 variables. Interval arithmetic is very cool. Checkout Moore’s book, it’s good. This might be a nice way of proving facts related to real analysis. Not sure.
This is funny because z3 internally uses interval arithmetic. So what we’re doing is either very idiotically circular or pleasantly self-similar.
We could use a similar arrangement to get complex numbers, which z3 does not natively support

Stupid Z3Py Tricks: Verifying Sorting Networks off of Wikipedia

Sorting networks are a circuit flavored take on sorting. Although you can build circuits for any size input, any particular circuit works for a fixed sized input. They are like an unrolling of the loops or recursion of more familiar sorting algorithms. They come up also in the context of parallel and gpu sorting

Here’s an interesting thing. We can go to Wikipedia and get a little python snippet for the comparison order of a Batcher odd-even mergesort. Kind of a confusing algorithm. Why does it even work? Is it even right? It’s written in some kind of funky, indexful generator style.

Source: Wikipedia

Well we can confirm this relatively straightforwardly using z3 by replacing the implementation of compare_and_swap with its z3 equivalent. We then ask z3 .

This comes back unsat, hence there are no inputs or executions that do not come back sorted. If I delete some elements from pair_to_compare, it comes back sat, showing that it doesn’t always sort.

The trick here is that the circuit is fixed size, so we have no need for induction, one of the main things z3 is rather finicky at.

It’s somewhat interesting to note that the output of odd_even_merge is a sequence of instructions, we can think of this as interpreting a very small 1 instruction programming language.

We can also confirm similarly a simple odd-even bubblesort and other similar algorithms.

Q: What about using uninterpreted sorts rather than integers? Integers is pretty convincing to me.

same_elems is slightly weaker than a permutation predicate. Wasn’t super obvious to me the best way to do a permutation predicate in z3. Would I want to internalize the array?

Edit: Upon further thought, actually the sort IS a nice predicate for permutation. How do we compute if two things are permutations of each other? By sorting them and forcing a zipped equality. Alternatively count the number of each element (a piece of bucket sort). Since this sort is done by composing swaps, it is somewhat intrinsically a permutation

As a bummer though, I think randomized testing on arrays would be equally or perhaps more convincing of the correctness of the algorithm. Oh well.

Programming and Interactive Proving With Z3Py

Fatal error: Uncaught Error: Call to a member function id() on array in /home/public/wordpress/wp-content/plugins/crayon-syntax-highlighter/crayon_formatter.class.php:36 Stack trace: #0 /home/public/wordpress/wp-content/plugins/crayon-syntax-highlighter/crayon_formatter.class.php(538): CrayonFormatter::format_code('', Array, Object(CrayonHighlighter)) #1 [internal function]: CrayonFormatter::delim_to_internal(Array) #2 /home/public/wordpress/wp-content/plugins/crayon-syntax-highlighter/crayon_formatter.class.php(516): preg_replace_callback('#()#msi', 'CrayonFormatter...', 'from z3 import ...') #3 /home/public/wordpress/wp-content/plugins/crayon-syntax-highlighter/crayon_highlighter.class.php(166): CrayonFormatter::format_mixed_code('from z3 import ...', Object(CrayonLang), Object(CrayonHighlighter)) #4 /home/public/wordpress/wp-content/plugins/crayon-syntax-highlighter/crayon_highlighter.class.php(186): CrayonHighlighter->process() #5 /home/public/wordpress/wp-content/plugins/crayon-syntax-highlighter/crayon_wp.class.php( in /home/public/wordpress/wp-content/plugins/crayon-syntax-highlighter/crayon_formatter.class.php on line 36