smattr / rumur Goto Github PK

yet another model checker

License: The Unlicense

CMake 1.29% C++ 57.06% Lex 0.68% Yacc 1.63% C 10.96% MATLAB 4.83% M 6.38% Python 4.41% Roff 3.06% Shell 1.18% Emacs Lisp 0.11% SMT 3.88% Objective-C 4.22% Limbo 0.21% Vim Script 0.11%

model-checking model-checker

rumur's Introduction

Rumur

Rumur is a model checker, a formal verification tool for proving safety and security properties of systems represented as state machines. It is based on a previous tool, CMurphi, and intended to be close to a drop-in replacement. Rumur takes the same input format as CMurphi, the Murphi modelling language, with some extensions and generates a C program that implements a verifier.

A more extended introduction is available in doc/introduction.rst

Quickstart

Installation on Ubuntu or Debian

apt install rumur

Installation on FreeBSD

pkg install rumur

Thanks to yuri@FreeBSD for packaging.

Building from Source

First you will need to have the following dependencies installed:

Either GCC or Clang
Bison
CMake
Flex
Libgmp
Python ≥ 3.4

Then:

# Download Rumur
git clone https://github.com/Smattr/rumur
cd rumur

# Configure and compile
cmake -B build -S .
cmake --build build
cmake --install build

# Generate a checker
rumur my-model.m --output my-model.c

# Compile the checker (also pass -mcx16 if using GCC on x86-64)
cc -std=c11 -march=native -O3 my-model.c -lpthread

# Run the checker
./a.out

Compilation produces several artefacts including the rumur binary itself:

rumur: Tool for translating a Murphi model into a program that implements a checker;
murphi2c: Tool for translating a Murphi model into C code for use in a simulator;
murphi2murphi: A preprocessor for Murphi models;
murphi2smv: Tool for translating a Murphi model into NuSMV input;
murphi2uclid: Tool for translating a Murphi model into Uclid5 input;
murphi2xml: Tool for emitting an XML representation of a Murphi model’s Abstract Syntax Tree;
librumur.a: A library for building your own Murphi model tools; and
include/rumur/: The API for the above library.

Comparison with CMurphi

If you are migrating from CMurphi, you can read a comparison between the two model checkers at doc/vs-cmurphi.rst.

Legal

Everything in this repository is in the public domain, under the terms of the Unlicense. For the full text, see LICENSE.

rumur's People

Contributors

Stargazers

Watchers

Forkers

svghadi lyknode errare-humanum-est civic-fv debian-janitor

rumur's Issues

properties, covers, etc

There are similarities between assertions, invariants and other currently-unsupported property categories like covers. It would be useful to allow a user to manipulate the category of any particular property at checker generation time. If each were given as a generic "property" I can imagine the following ways a user might want to use the property:

invariant/assertion. Globally this means "must be true after every transition." Locally this means "must be true now."
assumption. Globally this means "discard any state that ever makes this true." Locally this means "discard any state that encounters this as true."
cover. This means "must be hit at least once in exploration." This could include local covers that are simply true expressing that a given point must be reached.
disabled. Property is ignored.

We might also want to explore LTL-style properties.

auto-scalarsets

Once we have symmetry reduction support, it seems logical to implement auto-detection of range types that could be converted to scalarsets and do this by default.

records with duplicate fields

I think records with duplicate fields are currently incorrectly accepted. At least, duplicate state variables seem to be.

remove unused state variables

As an optimisation on generated models, we could remove state variables that are never used. This is useful for some advanced users.

non-freeing allocator

We're currently using the default allocator (new/delete on the "master" branch and malloc/free on the "c-checker2" branch). This allocation has overhead due to book-keeping the allocation size so it can be freed later on. In the case of allocated states, we never need to free them later. More accurately, we only ever need to free a single state that we will immediately then reallocate.

It should be possible to optimise memory usage by implementing our own allocator with no support for freeing memory; e.g. a simple bump-pointer allocator. This should be a valuable optimisation for larger models.

(The default allocators also have another secondary overhead of thread synchronisation. However, I don't think this is a major concern as most optimised stdlib implementations have thread-local arenas from which they allocate.)

larger initial set size

The current seen set begins at 8MB. This is extremely small considering you can now buy laptops with 128GB RAM. We should probably increase this in order to decrease the number of at-runtime expansions.

partial order reduction

rules in counterexample trace

There are two obvious discrepancies between the current counterexample trace and CMurphi's:

transition rules are not shown
whitespace in-between states and within

Both of these should be fixable with relative ease.

discrepancies in covered states

Rumur and CMurphi report different numbers of covered states for certain models. Looking at a tweaked version of german.m (turning both scalarsets into ranges and removing the rules "Store", "SendGntE", "RecvGntE" and both invariants) we end up with the following:

CMurphi: state size is 200 bits, 2274 states covered, 6336 rules fired
Rumur: state size 80 bits, 2730 states covered

It's unclear exactly what the trigger is and trying to minimise the model further brings their results back in line. It's not surprising we generate a smaller state size (though it is surprising how much smaller), but we certainly shouldn't be seeing a different number of total states.

unchecked math in array accesses and count/width arithmetic

There are some calculations which are currently done with silent type coercion and we could probably switch to the overflow builtins with assertions that we never overflow just to be safe. E.g. things like MAX - MIN + 1.

colour output

One of my wants for CMurphi was colour in the progress output indicating convergence or divergence. I.e. showing the states in the queue as green if shrinking and red (or yellow?) if expanding.

array access out of bounds

Consider an array of a trivial range:

x: array[0..1] of 0..1;

Accessing this via a numeric literal, x[1], somehow triggers an out of bounds read. This is visible as an assertion failure in read_bits. Currently not clear exactly why this happens.

Single-thread window for new states

One technique we could try for speeding up the multithreaded checker (related to #6) is to have a "window" array we initially store new states in on a single thread. Then after window expiry we insert all states in the array into the state set. A natural value for the window would be the number of states generated by a single pass through all the rules. There are (hopefully) two benefits to this approach:

We grab the state set lock once for the entire window, rather than once per state.
We are within the state set critical region less often.

These two related factors should hopefully result in us contending on the state set lock less.

suggest -march=native in compiler flags

Better hash

The current hash logic used for states is basically a placeholder. It should be pretty straightforward to replace it with something like xxhash.

extended operator support

Should be relatively straightforward to support things like bit shifting operators.

implement optional undefined semantics

We don't currently implement the additional value, undefined, for each primitive type. This concept itself is a bit of a mixed blessing. If you ever read undefined it's considered an error and explicitly setting a value to undefined is only useful to aid symmetry reduction. In practice some models would be better off without it and the extra state bits it induces.

My current plan is to add optional undefined semantics. You can set the default (whether a primitive has an undefined or not) via the command-line. This default will apply to all non-specific types. For a type that wants a non-default (wants an undefined value always or never), you can use nullable or non-nullable types:

type
  foo_t: 0 .. 1!; -- <- explicitly non-nullable (has no undefined value)
  bar_t: 0 .. 1?; -- <- explicitly nullable (has an undefined value)

This syntax is new, but I think can be implemented in such a way that it doesn't conflict with existing syntax.

opportunistic cmpxchg

We currently attempt to write to a slot with some variant on the following:

atomic_load
if NULL ?
  cmpxchg
else
  cmp ?
    ...
  ...

I haven't fully thought through the performance implications, but it seems like we could just attempt the cmpxchg upfront and see if it succeeds. This would bias us towards a hash table with low occupancy, but I suspect for most models this would result in a win. Worth profiling.

Fixed set capacity requires 64-bit addressing

On Linux, attempting to build a checker with a fixed 4GB state set induces relative memory references that exceed 32-bit addressing and needs to use one of the larger memory models (-mcmodel=large or, on older compilers, -mmodel=large). This is not a deal breaker but it's awkward, inefficient and unnecessary. We should be able to avoid this and optimise execution by using calloc.

drop width member of handle

The main interfaces for interactive with handles (handle_read, handle_write and handle_narrow) are all called from contexts where we statically know the handle width. We could optimise the structure by dropping the width member and specifying it as a parameter literal.

We should profile whether this is worth it because I do not know whether the compiler already notices it can do this transformation. In particular, I believe if we drop the width member the x86-64 ABI will let us pass the residual struct around in registers.

Investigate effect of using shifts instead of div/mod for set indexing

The set is always power-of-two-sized -- even in the face of runtime expansion -- but I highly doubt a compiler can deduce this. As a consequence, I suspect it emits expensive integer divide and modulo instructions when it could use simple shifts and subtract. In certain models this computation might end up contributing significantly to the overall runtime. We should profile the effect of explicitly shifting and subtracting instead on a large model.

sandbox checker

It looks like it would be worthwhile and not too onerous to sandbox the checker using whatever native mechanism is available (seccomp, Pledge, App Sandbox). The checker's code should not be sensitive or run anything controversial, but since it is being generated it seems safer to guard against accidental code injection this way.

BSD compatibility

I've aimed for this but not tested so far.

no division support

Through some sort of oversight, we don't seem to support the division operator, /.

Multithreaded optimisation

The current multithreaded checker performs pretty poorly. Checking tests/long-running2/model.m on a sample platform with 4 threads:

single threaded, unoptimised: 64 seconds
4 threads, unoptimised: 348 seconds
single threaded, optimised: 8 seconds
4 threads, optimised: 314 seconds

This is not surprising as I expected the initial implementation to be pretty terrible. The multithreaded checker uses a single mutex-protected queue and a single mutex-protected seen set. Execution is completely dominated by contention on these two mutexes.

For the queue, one possibility would be to switch to per-thread, mutex-protected queues with work stealing. A second option would be a multi-headed, lock-free queue. The multithreaded checker does not need to pop the queued states in any particular order, which gives us a lot of freedom.

For the seen set, it should be possible to implement a thread-safe, lock-free set.

Use native scalars of variable size

Currently we exclusively use int64_t scalars when dealing with unboxed types. It would be possible to actually use different types based on the bounds of the current value we're dealing with. This would allow us to support subsets of the range INT128_MIN to UINT128_MAX. This might result in a performance boost in some cases. I have not thought deeply about this.

machine readable output

A command line flag to switch to generating, e.g. XML, instead of human readable progress and cexes.

case sensitive identifiers

Rumur mistakenly treats Murphi as universally case-insensitive. In actual fact, only the keywords are case-insensitive. Identifiers are case-sensitive. E.g. i and I are different variables.

Quantifier loop can cause integer overflow

The increment part of the loop we emit for ruleset quantifiers can trigger integer overflow if the user has written a careless step size. We should safe guard this addition in the emitted code.

investigate fixed-size set containing pointers

The set we use when the user asks for a fixed capacity is currently an array of state values. It would be possible to instead use an array of state pointers. We can still preserve the capacity limits because we know the size of the state when allocating, so we can account for this in the allocation.

This will less effectively use memory (overhead of pointer storage) but would allow us to (a) use a more granular concurrency model with atomics instead of a global lock and (b) avoid the awkward calling convention where we currently modify the caller's pointer when inserting.

move --help text into a manpage

deadlock detection

why is ArrayValue a BitBlock?

While working on #8, I discovered ArrayValue inherits from BitBlock. Specifically, its contents are a packed byte array. I don't recall why I made this decision and it doesn't immediately seem logical now. This class represents local arrays and hence we may as well define it regularly and allow usual C++ struct layout to take over. This probably also enables the compiler to more easily optimise its uses.

failing an invariant results in exit with success

state allocation pool

Commit 533f202 effectively implemented a pool allocator with a single slot. If we proceed with generating a checker in C (currently in branch "c-checker2") it would make sense to reimplement some version of this as a thread-local allocator. I haven't profiled this, but my hunch is that it would have a significant effect in non-trivial models.

Remove StateBase vtables

Quoting something from the current sources:

          /* FIXME: This is somewhat awkward. We really want to just do
           * `s[slot] = *t` here. However, we got the memory backing `s` from
           * `calloc` so the vtables of the states were never initialised (to
           * non-null). Of course the assignment operator doesn't write to the
           * vtables because it thinks they've already been initialized. The
           * result is that using the assignment operator here appears to work
           * fine but then later operations that access the vtable segfault. As
           * you can imagine, this was not fun to debug. For now we call the
           * copy constructor to ensure we initialise the vtable, but I think a
           * better long term solution is to make StateBase non-virtual and
           * hence remove its vtable. To do this, we need something else to
           * inherit from BitBlock in place of StateBase and it in turn
           * reference a StateBase.
           */

Having StateBase participate in a class hierarchy is causing problems. Specifically, I didn't anticipate that all virtual calls cannot be collapsed at compile-time (e.g. a parameter to a function that is a reference to an integer that is not known to be in the state vector or in a local variable). Apart from the problem described above, it is highly undesirable for StateBase to have a vtable for allocation reasons. Checking can allocate millions or billions of these and the vtables add up.

We should refactor this area to pare back StateBase to something closer to a C struct with a few helpers.

distributed implementation

One of the long term goals is to move into the space of PReach and support a distributed checker. It looks like OpenMPI would be an ideal mechanism for this. There may be something one could learn from Eddy Murphi on this.

Duplicate resulting start states leak memory

It looks like a startstate that results in a duplicate (i.e. not inserted into the state set) is never freed.

symmetry reduction

DCLP-style multithreaded set addition

In relation to #6, another scheme to explore would be something in the style of double-checked locking. We already overload the state's previous pointer to indicate whether a slot is empty or not, so we could operate opportunistically on this perhaps get a win. This depends a lot on set contention and the model in question, so needs some experimentation with a broader corpus of models than we currently have.

array references cannot be assigned

The array copy assignment operator is deleted, which means we can't handle the code generated by expressions like:

a[0] := a[1];

TODO: thread-safe set expansion

The current set expansion logic is (known) not thread-safe. My original aim with the set was to implement something similar to what is described in Maier et al, "Concurrent Hash Tables: Fast and General(?)!" in arXiv 2016. This paper describes how to implement scalable, thread-safe expansion as well.

mcx16 hacks

While testing some recent experiments related to reference-counted pointers, I discovered a new GCC wrinkle. In recent versions of GCC, __atomic_compare_exchange* no longer emit a cmpxchg even when using -mcx16: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80878. If you trace this back, the reasoning is that emitting a cmpxchg is actually a bug because the "expected" pointer might point to read-only memory. cmpxchg writes to the "expected" pointer on success or failure, resulting in a segfault when it points to read-only memory.

I don't really follow this logic, because __atomic_compare_exchange* are already prototyped in the GCC docs as lacking const, so it you pass in a pointer to read-only memory you're violating the spec. But there is no arguing with the winds of fortune.

Anyway, the documented solution if you really want a C-level 128-bit cmpxchg is apparently to use the old school __sync_*_compare_and_swap. I don't understand how this is internally consistent because the same RO-memory arguments could apply to this function as well, but whatever. Given the axes of (1) compiler and (2) architecture, I think the safest choice is to:

#ifdef __x86_64__
  __sync_*_compare_and_swap
#else
  __atomic_compare_exchange*
#endif

The __sync builtins give no control over acquire/release semantics, but these have no bearing on x86-64 anyway, so this leaves us with the best of all worlds.

Further reading: https://gcc.gnu.org/ml/gcc/2017-01/msg00167.html

clean up semaphore implementation

The current semaphore implementation is split into two options -- Apple and non-Apple -- because Apple does not implement POSIX semaphores. This is suboptimal for a few reasons:

More code than desirable.
Possible discrepancies between the semantics of the two implementations.
Dependency on POSIX headers in addition to C++11 threads/atomics.

The semaphore is only used for initial jump start logic (ensuring the state queue is moderately full before releasing the herd of secondary threads), so we don't need it to be performant. We could hand roll a semaphore ourselves solely with C++11 atomics which would be cleaner.

c-checker: expansion fallback to realloc

The set expansion logic in the current "c-checker2" branch works by allocating a new set of twice the size of the existing set. This can fail if the total allocation (existing set + twice its size) would exceed our current limit, even though we only need the expanded set when we reach quiescence. It might be nice, if this fails, to fall back to a realloc based strategy. This would enable a sort of "last ditch effort" mode enabling the checking of models that only just fit in memory.

Parse CPP line directives

It's common for advanced CMurphi users to use the C pre-processor to parameterise models. To assist with this we could (a) make # a comment-starting character and (b) have support for interpreting line directives from CPP when giving users error messages.

No progress in optimised checker

As of 543626a, compiling german.m with GCC 7.1.0 with anything above -02 results in a checker that never seems to make any initial progress. I haven't yet debugged what the problem is here.

exhaustive scalarsets

It occurred to me that it should be possible for the checker to identify when symmetry reduction results in an upper range of a scalarset type being unnecessary. For example, the user may define a scalarset(10) but the checker may find that in practice any value above 5 is always symmetric to one less. Hence, the user could have just defined a scalarset(6).

Extending this idea, it should be possible for the checker to take an "infinite" scalarset and find a "useful" upper bound to it. Subsequent to having this idea, I read David Dill's "A Retrospective on Murϕ" and realised he had already had this idea. It does not look like this was ever implemented in a Murphi variant, but it certainly seems like a worthwhile goal for Rumur.

assignment does not validate that LHS and RHS have the same type

manpage probably inaccessible after install

The logic that tries to find and display the manpage for Rumur is assuming the directory structure of the build directory. This is not the same as the installation location. In fact, I don't think the manpage even gets installed currently. We should fix this as well as making more of an effort to locate the post-install manpage, including if the binary is called via $PATH.