Hobbit (Higher-Order Bounded Bisimulation Tool) is a higher-order contextual equivalence checking tool that combines Bounded Model Checking and Up-To Techniques to verify equivalences and find inequivalences.
Publication DOI: 10.1007/978-3-030-99527-0_10
Venue: TACAS 2022
Directories present under the main Hobbit directory:
experiments: experimental data as recorded for the submitted paper, also attached hereprograms: our testsuite, which is divided into subdirectories for equivalencesequiv, inequivalencesinequiv, and debuggingmiscsrc: source files of Hobbit
Instructions below were tested for Linux and macOS. opam is not yet officially supported for Windows.
Dependencies:
- OCaml 4.08+ with
ocamlbuild - Opam
- Menhir
- Z3 with OCaml bindings
All dependencies are obtainable through OCaml's official package manager opam. Installation of opam is system specific so you may need to refer to their website linked above. Instructions for some popular systems are listed below:
add-apt-repository ppa:avsm/ppa
apt update
apt install opam
dnf install opam
# Homebrew
brew install gpatch
brew install opam
# MacPort
port install opam
opam needs to be set up after installation, and this may be system dependent. First one needs to initialise it:
opam init
After initialisation, one has to create the switch to a specific compiler. Any version 4.08 and over works. The command below uses 4.08.1, but one can use the latest version listed.
opam switch create 4.08.1
If this does not work, it could be because opam is missing a dependency. Depending on how minimal the installation of the system is, one may have to install many dependencies. You may need to check the error messages to know what is missing. In our experience, these are the dependencies typically missing e.g. for Ubuntu:
apt install build-essential
apt install gcc
apt install bubblewrap
The above correspond to make, gcc and bwrap, which are often missing in fresh installations.
Finally, initialising opam updates your ~/.profile file, so you may have to source it on e.g. Ubuntu:
source ~/.profile
With opam set up, installing Hobbit's dependencies becomes very easy.
opam install ocamlbuild
opam install menhir
opam install z3
Note that Z3 takes a long time to install.
With all dependencies installed and ~/.profile sourced, call the make file:
make
This produces a hobbit.native binary.
Hobbit inputs a file containing two program expressions, written in Hobbit’s input language which is based on ML. Hobbit attempts to (dis-)prove that the expressions are contextually equivalent. That is, replacing one for another in any possible program context does not change the observable behaviour of the program. See the paper for more details on contextual equivalence.
The primary options necessary for usage are:
-i <input file path>: selects a file to check; otherwise, Hobbit expects input viastdin-b <integer bound>: number of function applications Hobbit is allowed to perform before returning an "incoclusive" result to the user-u <string>: turns on up-to techniques; each character in the string provided turns on an up-to technique associated to that character; e.g. "ngsrialfzue" for [n]ormalisation [g]arbage-collection up-to-[s]eparation up-to-name-[r]euse up-to-[i]dentity sigma-g[a]rbage-collection sigma-norma[l]isation sigma-simpli[f]ication generali[z]ation remove-gamma-d[u]plicates up-to-re[e]ntry (default: ngsrialfzue)
Advanced options:
-t <0|1>: 0 for BFS and 1 for DFS traversal-p <integer>: provides a maximum number of configurations pending to be explored in the frontier; upon exceeding this, the tool stops-m <integer>: provides a maximum number of configurations remembered; upon exceeding this number, the newest configuration added replaces the oldest one remembered
Developer options:
-d <string>: similar to-u, but turns on printing debug information depending on the characters present in the string-l <integer>: used to select other LTS implementation; unused in this version of the tool
For list of options available under each parameter, use --help.
Techniques presented in the paper:
- separation: up-to separation (see paper sec.5); prunes names from the knowledge environments if they only manipulate local references.
- generalisation: up-to state invariants (see paper sec.5); generalises store values using a given invariant
- reentry: up-to proponent function re-entry (see paper sec.7); prevents reentry of functions if the store has not changed between calls
Other Techniques, more standard techniques:
- normalisation: normalises configurations memoised
- identity: prunes path if the configurations checked for equivalence are structurally identical
- name reuse: reuses opponent names that are no longer in use
- remove gamma duplicates: removes duplicate names in the opponent knowledge environment (gamma) provided they occur in the same index on both sides
- garbage collection: garbage collects unused store locations in configurations
- sigma normalisation: normalises the symbolic environment (sigma)
- sigma simplification: runs simplification procedure over the symbolic environment (sigma)
- sigma garbage collection: removes constraints involving redundant variables in the symbolic environment (sigma)
Many example input files, also used in the experimental evaluation section of the paper, can be found in the ‘programs’ directory with the following subdirectories:
programs/equivexamples of equivalent expressions;programs/inequivexamples of inequivalent expressions;programs/misca number of examples used for sanity-checking Hobbit.
An input file has the structure:
<pgm expression> ||| <pgm expression>
when the top-level types of the program expressions can be completely inferred by Hobbit’s type inference system or
<pgm expression> |||_<type> <pgm expression>
when Hobbit's type inference cannot fully infer types and the user has to provide the types of the expressions. Both expressions should have the same type.
Use the -i <filename> option to select an input file. When omitting the -i option Hobbit will read from the standard input, which can be useful with shell piping. For example:
./hobbit.native -i programs/equiv/meyer-sieber-e1.bils
Hobbit is primarily a bounded model-checker for bisimulation equivalence over the transition system described in detail in the research paper. As such a bound must be provided to limit the exploration of the explored (potentially infinite) state-space. This is provided with the -b parameter. The default bound is 6, which is sufficient for many examples. However several examples require a larger bound. For example the following example requires a bound of 332 or more:
./hobbit.native -i programs/equiv/list-qsort-N6.bils -b 332
Hobbit uses memory to store the number of pending bisimulation states (pairs of program configurations) that are still to be explored and the number of previous bisimulation states memoised. The former are created by a single step of Hobbit's exploration approach (DFS or BFS) and exceeding this limit will result in a runtime error. The latter is used to detect already explored states which is crucial for Hobbit to avoid exploring the same states multiple times and terminate. Limits to pending and memoised states are provided with the command line options -p and -m, respectively. Defaults are
1000000 and 10000 respectively.
Hobbit outputs both in the standard output and the standard error. The latter is used when input files fail to parse/typecheck or when an unexpected runtime error occurs. Standard output contains information about the equivalence checking functionality. For example:
./hobbit.native -i programs/equiv/meyer-sieber-e1.bils
****************
Debug mode:
Input file: programs/equiv/meyer-sieber-e1.bils
LTS implementation: immutable list
Bound: 6
Traversal: BFS
Max Pending Confs: 1000000
Memoisation cache size: 10000
Up-to techniques: ngsrialfzue
****************
END! Nothing more to explore; no difference was found between the terms with bound = 6. The bound was not exceded - the terms are indeed equivalent!
In order, each line of this output informs the user of the following:
- Hobbit was not in debug mode (used during Hobbit's development);
- the input file;
- an implementation-specific detail about the representation of the stack/queue used for pending states;
- the exploration traversal used;
- The limit in pending bisimulation states (pairs of configurations);
- the limit in memoised bisimulation states;
- the up-to techniques used in this run (see description below).
- The outcome of equivalence checking -- here the programs were indeed found to be equivalent.
The following example shows the output for an inequivalence example:
****************
Debug mode:
Input file: programs/inequiv/symb-const-4.bils
LTS implementation: immutable list
Bound: 6
Traversal: BFS
Max Pending Confs: 1000000
Memoisation cache size: 10000
Up-to techniques: ngsrialfzue
****************
Bisimulation failed! Failing trace:
-pr_ret _idx_1_->-op_call _idx_1_ ((ac9: Bool), (ac8: Bool))->-pr_ret _idx_1_->-op_call _idx_1_ (ac18: Bool)->-[ RHS=bot ]->-pr_ret (w17: Bool)->-[ only LHS terminates ]->
Symbolic Environment:
(w_17 <> true) and
(w_17 = (ac_9 = ac_8))
Satisfying Assignment:
(define-fun ac_9 () Bool
true)
(define-fun ac_8 () Bool
false)
(define-fun w_17 () Bool
false)
Here the user is informed that the program expressions were found to be inequivalent, together with a distinguishing trace in the Labelled Transition System described in the research paper, and the symbolic environment of the final configuration. More specifically this trace shows the sequence of transitions:
The trace above reports that the program expressions were inequivalent under the model {ac9 := true; ac8 := false; w17 := false}. The ac names are abstract constants provided by the opponent, while the w name is a symbolic constant resulting from evaluating a symbolic arithmetic expression. The discrepancy in the _ present in the names is due to a difference in how Z3 and Hobbit print names; internally, names ac_9 and ac9 are both the same abstract constant with index 9. Briefly, the trace reports:
-pr_ret _idx_1_->: both initial configurations performed a proponent-return transition, which returned a function to the opponent, this function was indexed under position 1 of an environment called Gamma in the research paper;-op_call _idx_1_ ((ac9: Bool), (ac8: Bool))->: the opponent calls this function given to it at1with the pair of constantsac9, ac8, and because the function at index1is never called again, it is garbage collected;-pr_ret _idx_1_->: proponent then returns another function, which takes up position1in Gamma which was just freed;-op_call _idx_1_ (ac18: Bool)->: the opponent then calls this new function at1with some new constantac18;-[ RHS=bot ]->-pr_ret (w17: Bool)->: at this point, the left-hand-side returns the result of an arithmetic expressionw17which the right-hand-side cannot match, resulting in the RHS becoming a specialbotconfiguration (RHS=bot) which by the rules of the labelled transition system cannot terminate;-[ only LHS terminates ]->: the opponent is able to observe termination of the left-hand-side only, thus verifying the inequivalence.
The symbolic environment printed here shows that in this distinguishing trace, the value w17 returned by the LHS, is false and was the result of ac_9 = ac_8. In other words the program expressions can be distinguished when the first opponent call provides a pair of unequal booleans, which the satisfying assignment produced by Z3 instantiates to ac9 = true and ac8 = false.
Experimental data used in the paper is provided in the directory experiments. There are two main sub-directories:
hobbit_experiments: this contains the output of therun-tests.shused for the table in Section 8 of the paper.syteci_experiments: this contains the programs used for comparing Hobbit with the related tool called SyTeCi, described in Section 9.
Directory structure for the experiments is as follows:
experiments
|
└───hobbit_experiments <experiments benchmarking Hobbit>
| | bfs_equiv.ods <spreadsheet recording all output for experiments for equivalences>
| | bfs_inequiv.ods <spreadsheet recording all output for experiments for inequivalences>
| | bfs_equiv.txt.csv <csv file for equivalences produced by processing all raw data using a script>
| | bfs_inequiv.txt.csv <csv file for inequivalences produced by processing all raw data using a script>
| | make_csv.sh <script we used to collect the raw data into csv files>
| |
| └───default <directory containing our logs for default mode>
| └───no_sep <directory containing our logs for no-separation mode>
| └───no_gen <directory containing our logs for no-invariants mode>
| └───no_reentry <directory containing our logs for no-reentry mode>
| └───no_all <directory containing our logs for all-up-to-techniques-off mode>
|
└───syteci_experiments <experiments to compare Hobbit and SyTeCi>
| table.ods <spreadsheet recording all results of the comparison experiments>
|
└───hobbit_progs
| |
| └───from_hobbit <programs chosen from Hobbit's testsuite in Hobbit syntax>
| | | run-tests.sh <copy of `run-tests.sh` added to aid reproduceability tests>
| | └───programs
| | └───equiv <equivalence programs from Hobbit's testsuite>
| | └───ineq <inequivalence programs from Hobbit's testsuite>
| |
| └───from_syteci <programs from SyTeCi's testsuite in Hobbit syntax>
| | ... <same directory structure as `from_hobbit`>
|
└───syteci_progs
|
└───from_hobbit <programs chosen from Hobbit's testsuite in SyTeCi syntax>
| └───eq <equivalence programs from Hobbit's testsuite>
| | | results.txt <text file with our results>
| | | scale.sh <script that produced results.txt, requires SyTeCi's binaries>
| |
| └───ineq <inequivalence programs from Hobbit's testsuite>
| | ... <same directory structure as `eq`>
|
└───from_syteci <programs from SyTeCi's testsuite in SyTeCi syntax>
| ... <same directory structure as `syteci_progs/from_hobbit`>
In the directories above, hobbit_experiments/bfs_equiv.ods and hobbit_experiments/bfs_inequiv.ods collect all benchmarking done on Hobbit, and syteci_experimenets/table.ods record all data from the comparison.
A script is provided to run our experiments as described in the paper.
Note that this script requires timeout from GNU Coreutils. All files checked are set to time out after 150s. Also note that the compiled binary hobbit.native needs to be in the main directory (with the script) for the script to work.
The script can be called as follows:
bash run-tests.sh
The script runs Hobbit on all files in the programs/ subdirectories in batches. Each batch corresponds to an experiment in which the tool is run with options specified in the paper. These are:
default: all up-to techniques onno_sep: up to separation offno_gen: up to invariants offno_reentry: up to reentrancy offno_all: all up-to techniques off
Every batch contains two sets of files: equivalences and inequivalences. On the terminal, at the bottom of a batch, you can see the result of that batch. e.g. for the default batch:
BFS Inequivalence Checks result in file: logs/default/out_bfs_inequiv.txt
*** 72 equivalences proved
*** 68 inequivalences proved
*** 33 inconclusive examples
*** 0 error(s) in programs
*** 173 files checked
The script produces log files with the following structure:
logs
|
└───default
| | files_bfs_equiv.txt <equivalences: files checked, one per line>
| | err_bfs_equiv.txt <equivalences: stderr stream, collects all the times and errors in every line; synchronised with the files_bfs_equiv.txt file>
| | out_bfs_equiv.txt <equivalences: stdout stream, collects standard output of the tool>
| | status_bfs_equiv.txt <equivalences: collects parsed exit status: yes,no,N/A,error; synchronised with the files_bfs_equiv.txt file>
| | files_bfs_inequiv.txt <inequivalences: files checked, ...>
| | err_bfs_inequiv.txt <inequivalences: stderr stream, ...>
| | out_bfs_inequiv.txt <inequivalences: stdout stream, ...>
| | status_bfs_inequiv.txt <inequivalences: parsed exit status, ...>
|
└───no_sep
| | ... <same structure as default>
|
└───no_gen
| | ... <same structure as default>
|
└───no_reentry
| | ... <same structure as default>
|
└───no_all
| | ... <same structure as default>
To relate the files, err and status messages use the paste command; e.g.:
paste logs/default/files_bfs_equiv.txt logs/default/err_bfs_equiv.txt logs/default/status_bfs_equiv.txt
As the script progresses, testing batches should slow down drammatically and error messages start popping up. These are expected timeouts and show that up-to techniques work. With all techniques turned off, Hobbit should fail to prove all but 3 trivial equivalences, and be very slow at proving inequivalences. You can verify the error messages are indeed timeouts by checking the logs, which shall be explained following this.
The script can output 1 of 4 messages for every file:
equivalent: the terms have been proven contextually equivalentinequivalent: the terms have been proven contextually inequivalentinconclusive: Hobbit reached the bound and could not prove equivalence or inequivalenceerror: Hobbit stopped unexpectedly, which can be for the following reasons:-
message is present in the log files: Hobbit crashed and the message details the reasons. None of the provided examples in our experiments produce this case.
-
message is not present in the log files: Hobbit was stopped by the script itself due to timeout, which is expected to happen after 150s. For example, after running the
no_sepbatch of experiments on an average laptop, the file/programs/equiv/hector-bsort-CN5-sorted.bilsis likely to timeout. One could then observe the following line in the combined log files with the command:paste logs/no_sep/files_bfs_equiv.txt logs/no_sep/err_bfs_equiv.txt logs/no_sep/status_bfs_equiv.txt | grep hector-bsort-CN5-sorted.bils hector-bsort-CN5-sorted.bils 150.080 errorshowing that this program timed out at 150s.
-
A copy of run-tests.sh is provided in the Hobbit directories to aid reproduceability tests for the Hobbit side of the comparison. To use the scripts, a copy of hobbit.native is required in the same directory as the scripts being run. After building Hobbit, one can copy the symbolic link in the main directory into syteci_experiments/hobbit_progs/ and syteci_experiments/. Alternatively, one could add hobbit.native in the _build directory produced to PATH environment. To run the SyTeCi side of the comparison, a copy of SyTeCi's binaries is needed.