CAV 2012, Day 1: Thoughts on Numerical Static Analysis

Eep! I completely forgot the notes I typed out way back in 2012 about CAV and have neglected this blog for a while now.

The Gauge Domain: Scalable Analysis of Linear Inequality Invariants, Arnaud Venet

There is a rather curious algorithmic phenomenon that has a very direct impact on static analysis for numeric invariants. We can reason about equality and congruence relationships between variables somewhat efficiently. We can reason about inequalities concerning single variables somewhat efficiently. Efficient reasoning about inequalities over multiple variables is, however, not considered a solved problem.

There are intuitive reasons for the efficient solutions. When reasoning about equalities, the only property we have to worry about is transitivity, so a simple saturation routine comes to our aid. In the presence of uninterpreted functions, we also have to reason about congruences, and the congruence closure algorithm still permits efficient reasoning. There are two generalisations of the logic of equality with uninterpreted functions that are relevant and useful in practice.
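The saturation idea above can be sketched in a few lines. This is a minimal, illustrative version (the names `UnionFind` and `congruence_closure` are mine, not from any particular tool): a union-find structure handles transitivity of equality, and a naive saturation loop merges the results of congruent function applications until nothing changes. Real congruence closure implementations are far more clever about finding the congruent pairs.

```python
class UnionFind:
    """Equivalence classes with path-halving find; transitivity for free."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx != ry:
            self.parent[rx] = ry


def congruence_closure(equalities, applications):
    """equalities: pairs of terms asserted equal.
    applications: triples (f, arg, result) meaning result = f(arg).
    Saturates: whenever arg1 ~ arg2, merge f(arg1) ~ f(arg2)."""
    uf = UnionFind()
    for a, b in equalities:
        uf.union(a, b)
    changed = True
    while changed:  # keep merging until a fixed point is reached
        changed = False
        for f1, a1, r1 in applications:
            for f2, a2, r2 in applications:
                if f1 == f2 and uf.find(a1) == uf.find(a2) \
                        and uf.find(r1) != uf.find(r2):
                    uf.union(r1, r2)
                    changed = True
    return uf


# x = y entails f(x) = f(y)
uf = congruence_closure([("x", "y")],
                        [("f", "x", "fx"), ("f", "y", "fy")])
assert uf.find("fx") == uf.find("fy")
```

The quadratic inner loop is the obvious cost of doing saturation naively; efficient congruence closure replaces it with signature hashing, which is what makes the reasoning practical at scale.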

One generalisation is to consider interpreted functions and a notion of equality tied to the interpretation. The functions may be operations on a field, and the equality may be either equality on values or congruence modulo some value. If we have linear equations over the reals, the rationals, or a finite field, some variant of Gaussian elimination can be used to manipulate the equations in a manner that performs well in practice. I say "performs well in practice" to avoid confusion with polynomial worst-case complexity, because polynomial time for a single operation does not translate into an efficient static analysis. There is work on the algorithmic complexity of static analysis procedures, but that is a story for another post.
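For concreteness, here is a bare-bones Gaussian elimination over the rationals, the kind of routine that would sit at the heart of a linear-equalities domain. It is a sketch, not anyone's production code; I use Python's exact `Fraction` arithmetic because a static analyser cannot tolerate floating-point drift in its invariants.

```python
from fractions import Fraction


def gaussian_eliminate(rows):
    """Reduce linear equations a1*x1 + ... + an*xn = b, each given as
    [a1, ..., an, b], to reduced row-echelon form over the rationals."""
    rows = [[Fraction(v) for v in r] for r in rows]
    pivot_row = 0
    ncols = len(rows[0]) - 1  # last entry of each row is the constant b
    for col in range(ncols):
        # find a row at or below pivot_row with a nonzero entry here
        pr = next((r for r in range(pivot_row, len(rows))
                   if rows[r][col] != 0), None)
        if pr is None:
            continue
        rows[pivot_row], rows[pr] = rows[pr], rows[pivot_row]
        piv = rows[pivot_row]
        piv[:] = [v / piv[col] for v in piv]  # scale pivot to 1
        for r in range(len(rows)):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], piv)]
        pivot_row += 1
    return rows


# x + y = 3 and x - y = 1  =>  x = 2, y = 1
reduced = gaussian_eliminate([[1, 1, 3], [1, -1, 1]])
assert reduced == [[1, 0, 2], [0, 1, 1]]
```

A single call like this is cheap; the cost the analysis literature worries about comes from running such operations, plus joins and widenings, at every program point inside a fixed point iteration.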

Another generalisation of equality with uninterpreted functions is to consider inequalities with monotone functions. Such constraints can be represented by directed graphs and decided using algorithms that operate on directed graphs. If we consider arithmetic functions, things quickly become complicated. As long as we limit ourselves to exactly one variable per inequality, we can reason about the constraints efficiently. The reason is that variables do not depend on each other, so the bounds on different variables can be determined independently and, importantly, can be represented and manipulated independently. This independence has a significant impact. If we move to two variables per inequality, the picture is less clear. If the constraints may only have unit coefficients, intuition from graph algorithms leads to cubic algorithms. This is the technology underlying the Octagon abstract domain. Manipulating such constraints repeatedly inside a fixed point engine while achieving good performance for the entire analysis is challenging.
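The graph intuition is easy to make concrete in the simpler difference-bound fragment (constraints of the form x - y <= c, a restriction of the Octagon domain's full ±x ±y <= c form). Each constraint is an edge, tightening bounds is shortest paths, and the all-pairs closure is exactly Floyd-Warshall, which is where the cubic cost comes from. A minimal sketch, with an illustrative function name of my own choosing:

```python
INF = float("inf")


def close_differences(n, constraints):
    """Close difference constraints x_i - x_j <= c under transitive
    combination: (x_i - x_k) + (x_k - x_j) bounds x_i - x_j.
    This is Floyd-Warshall, hence the cubic cost."""
    # m[i][j] is the best known bound on x_i - x_j (0 on the diagonal)
    m = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for i, j, c in constraints:
        m[i][j] = min(m[i][j], c)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                m[i][j] = min(m[i][j], m[i][k] + m[k][j])
    return m


# x0 - x1 <= 2 and x1 - x2 <= 3 entail x0 - x2 <= 5
m = close_differences(3, [(0, 1, 2), (1, 2, 3)])
assert m[0][2] == 5
```

The closure itself is textbook; the hard engineering problem the talk alludes to is invoking it, together with joins and widenings, over and over inside a fixed point engine without the cubic factor dominating the whole analysis.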

Sometimes two variables per inequality are not enough, and equalities over arbitrary variables are insufficient as well. We really want to manipulate inequalities over multiple program variables. Even the well-studied case of linear inequalities is still a major challenge. There has been work on this problem in recent times, and Arnaud Venet's paper advances the state of the art. The introduction of his paper alone is a masterful survey of the field. He also explains why techniques from linear programming do not suffice for the computations required in static analysis. Go read it!
