I missed the second day of pre-CAV workshops and the tutorial day on account of a deadline, but managed to make time to attend talks on the first day of CAV. This is the second time I was attending CAV, having previously attended it in Edinburgh in 2010. Surprisingly, I think I probably read more CAV papers before 2010 than after. I assume that CAV regulars can make more informed comments on trends visible in CAV proceedings, but from my limited knowledge, I would say that there are very few papers appearing on hardware verification, and the number of papers on temporal-logic related topics also seems to be declining. That said, Day 1 of CAV included a fair amount of automata and temporal logic. I will briefly summarise the talks I remember from this day.
Deterministic Automata for the (F,G)-fragment of LTL. Jan Kretinsky and Javier Esparza
The construction of automata from a formal system, like a regular expression or a temporal logic is one of the most fundamental problems in computer science. Regular expressions compile into nondeterministic finite automata, and such automata are usually complemented when checking properties such as language inclusion, or universality. Complementation usually requires determinization, which in turn involves a blow-up that is exponential or worse. The worst-case complexity of these problems and the sophistication of the algorithms required is even worse for translating LTL to Buchi automata. A construction due to Safra is well known and studied in the literature and several authors have worked on improving it, or providing alternative, simpler determinization procedures.
A question driving current research, and this paper in particular is whether we can identify sub-logics of LTL which compile efficiently into deterministic Buchi automata. This paper shows that it is quite simple to construct a Buchi automaton for the fragment of LTL that contains globally (G) and finally (F) as the only modalities. I really enjoyed this part of the talk and the paper, primarily because I could follow it and appreciate the simplicity of the construction. The main idea behind the first construction is to use sets of sets of sub-formulae to represent states, rather than just sets of sub-formulae. Thus, non-determinism in the automaton can be pushed into disjunction of formulae. The final automaton may be doubly exponential in the size of the formula, but the construction is simple, so we avoid determinization. One important “trick” is that using formulae alone is not enough, but one needs to remember the previous transition made. In the author’s own words:
The reader might be surprised or even annoyed by the fact that the logical structure of the state space is not sufficient to keep enough information to decide whether a run ρ is accepting. In order to ensure this, we remember one-step history in the state. Why is that?
If you cannot sleep without the answer, I can recommend their paper, which I enjoyed reading. There was much more to this work including a study of Muller and Rabin automata, but I did zoned out during this bit, because I was trying to work out the construction myself.
The rest of the first session was devoted to LTL and synthesis. The second session of the day was concerned with learning-based techniques and termination, and even their combination.
Interpolants as Classifiers. Rahul Sharma, Aditya Nori and Alex Aiken
In 2003, Ken McMillan showed that by using Craig interpolation algorithms, one could derive a model checker using only a satisfiability solver. His first paper worked the idea out for propositional logic and finite-state transition systems and later papers extended the idea to first-order theories. There are several appealing features about interpolation. It allows the efficiency and reasoning power of a solver to be reused for model checking. Interpolation, as implemented by verifiers, overapproximates the result of existential quantifier elimination allowing for a form of abstraction. Finally, interpolation also includes heuristic generalisations, which is very important for discovering invariants in practice.
The last point connects to the paper above. I was happy to see ideas from machine learning applied to interpolation. From very, very far away, machine learning can be seen as comprising of algorithmic approaches to deriving generalisations from specific instances. In this sense it appears to be a good approach to address one of the important problems in program verification. In principle, I do not believe the machine learning has to provide interpolants, but rather provide some series of approximations that culminates in an inductive invariant. In the paper, the authors use support vector machines (SVMs) to generate interpolants.
There are several appealing specifics to their approach. The use of SVMs allows verification folks access to a lot of mature machine learning technology, so we have a practical win. Second, SVMs are driven by examples, and not by deductive reasoning, as is the case for solvers. This means that if there are non-linear functions in the definition of a problem, they do not directly affect the SVM because we only need values of the function at various points to run the SVM. While not covered in detail in the paper, this approach also has potential for greater support for non-linear reasoning, because machine learning techniques have been extended to non-linear functions.
Learning Theory Applied to Program Verification
There were two talks applying algorithmic learning theory to program verification . Algorithmic learning theory is quite different from machine learning in the techniques used and problems studied. The main intuition I have is along the same lines as that for applying machine learning. There are situations in which we would like to be able to generalise to a concept from a few examples. Techniques from learning theory can help here.
In Learning Boolean Functions Incrementally by Yu-Fang Chen and Bow-Yaw Wang, the goal is to use examples to learn Boolean functions. The intuition is that in problems like constructing and verifying predicate abstractions, we can reason about atomic predicates or conjunctions of atomic predicates within reasonable efficiency bounds, but reasoning about Boolean combinations of atomic predicates (which may be required to describe an inductive invariant) can be computationally expensive. This is where learning of Boolean functions enters the picture. By using learning techniques, we can conjecture Boolean combinations rather than compute them. If the conjecture is inaccurate, it has to be modified using the learning algorithm.
In Termination Analysis with Algorithmic Learning by Wonchan Lee, Bow-Yaw Wang and Kwangkeyn Yi, the similar intuition as above is applied to prove termination. Again, we have predicates and combinations of predicates are used to describe invariants. Or in this case, termination arguments via transition invariants. The way I understand it, the learning algorithm is a way to interact between a termination checker and an engine that conjectures termination arguments. It allows for automatic discovery of a combination of well-foundedness arguments that leads to a proof for the entire program.