This is the code base for the Basilisk prototype, as described in the OSDI'25 paper "Basilisk: Using Provenance Invariants to Automate Proofs of Undecidable Protocols".
The paper builds on the concept of an Invariant Taxonomy introduced in Kondo, and factors the invariants of a distributed protocol into various sub-classes of Regular Invariants. Given a distributed protocol, Basilisk automatically generates a set of Regular invariants that is strong enough to form the inductive invariant of the protocol.
A copy of this repository is also hosted on Zenodo (link).
This artifact has two main directories.
The directory local-dafny/ contains the source code and executable for the Basilisk tool. It is built on top of the Kondo artifact, which itself is a fork of Dafny 4.2.0.
The core Basilisk functionality is implemented in the local-dafny/Source/DafnyCore/Basilisk/ sub-directory.
The directory basilisk/ contains the protocols on which Basilisk is evaluated, together with the scripts for performing the evaluation.
The directory kondo/ contains the protocols on which Kondo is evaluated, as a reference.
The following instructions have been tested on an M3 MacBook Pro running MacOS Sequoia. Libraries and commands for other OS may differ.
We begin with building our local version of Dafny that contains Basilisk extensions.
-
Dependencies:
- Install .NET SDK
- Note that version 6.0 is required: https://dotnet.microsoft.com/en-us/download/dotnet/6.0
- Install z3:
- This can be done using brew:
brew install z3.
- This can be done using brew:
- Python 3 and pip3 are needed but they are likely already part of your Mac installation.
- Install .NET SDK
-
Build Dafny. From the project root, run:
cd local-dafny makeThis should take less than 20 seconds. Note that you may also be required to install Java 17 as a dependency for gradle.
-
To check that Dafny runs as expected, run from the local-dafny directory:
./Scripts/dafny /compile:0 test.dfy
The expected output is
Dafny program verifier finished with 1 verified, 0 errors
Now that Dafny is set up, we check that all 16 protocols in our evaluation passes the Dafny verifier. From the repository root, run:
cd basilisk/
./verify-all This script runs the Dafny verifier on each of the protocol. It takes about 5 minutes on an M3 MacBook Pro, while consuming up to 1GB of RAM.
Note that warnings of the form Warning: /!\ No terms found to trigger on. can be safely ignored.
Important
This step must be completed before moving on to the Paper Evaluation section, as it produces auto-generated files required to complete the evaluation.
One may also verify each version of each protocol individually, by running the verify script in the respective sub-directory for the protocol version. For instance:
./basilisk/clientServer/automate_gen2/verifyThis section involves reproducing the results published in the paper.
Basilisk's goal is to relieve developer effort in verifying distributed systems. It accomplishes this on two fronts:
- The user only needs to provide a few simple hints for Basilisk to automatically generate an inductive invariant for each protocol.
- The user should be responsible for writing fewer lines of proof code, compared to its predecessor Kondo.
The next two section detail how we obtain these numbers.
Before proceeding, please first run:
cd basilisk/
./verify-all
to generate all of the required files.
Claim: User only needs to provide a few simple hints for Basilisk to automatically generate an inductive invariant for each protocol.
There are two types of hints, namely monotonic type annotations, and Provenance hints, both listed in Table 1 of the paper. These numbers are obtained through manual inspection. Using only these hints, basilisk generates invariants sufficient to form the inductive invariants of the evaluated protocols.
For monotonic type annotations, we count the number of monotonic variables in the host state of each protocol, i.e., types that begin with the Monotonic prefix.
These are located within the Variables datatype definition within each host.dfy file.
For example, we can look at the Variables definition in basilisk/paxos/automate_gen2/hosts.dfy, to see that 5 monotonic variables are used.
Note that Basilisk's use of monotonic variables is identical to Kondo, and hence Basilisk shares the same numbers as Kondo.
For provenance hints, we count the number of predicates defined in the customMessageInvariants.dfy file for protocol (e.g., basilisk/multiPaxos/automate_gen2/customMessageInvariants.dfy). Note that only protocols that actually require such hints have such a file in their directory.
The expected counts are:
| Monotonicity annotations |
Provenance hints |
|
|---|---|---|
| Echo Server | 1 | 0 |
| Ring Leader Election | 0 | 0 |
| Simplified Leader Election | 1 | 0 |
| Paxos | 5 | 0 |
| Paxos-combined | 5 | 0 |
| Paxos-dynamic | 4 | 2 |
| Flexible Paxos | 5 | 0 |
| Distributed Lock | 0 | 0 |
| ShardedKV | 0 | 0 |
| ShardedKV-Bactched | 0 | 0 |
| Lock Server | 0 | 0 |
| Two-Phase Commit | 3 | 0 |
| Three-Phase Commit | 5 | 0 |
| Reduce | 0 | 0 |
| Raft Leader Election | 1 | 2 |
| Multi-Paxos | 4 | 2 |
In contrast, Kondo's use of "Recv hints" is derived from inspecting Kondo's codebase.
In Kondo, these are predicates in the hosts.dfy file that end with the suffix "trigger" (e.g., predicate ReceivePromiseTrigger in this file).
To demonstrate that we prove protocols using only Basilisk's generated invariants as the inductive invariant, we can study the "verify" script located in each protocol's directory (e.g., basilisk/multiPaxos/automate_gen2/verify). In the script:
- Command Basilisk to generate a set of footprints using the Atomic Sharding Algorithm described in section 4 of the paper. This generates the file footprintsAutogen.json in the same directory as the verify script.
- Command Basilisk to use the generated json file, together with the protocol definition, to generate a set of Regular Invariants, in the files messageInvariantsAutogen.dfy, monotonicityInvariantsAutogen.dfy, and ownershipInvariantsAutogen.dfy (when ownership invariants are applicable).
- Verify the applicationProofDemo.dfy file. This file is what users of Basilisk writes. It defines the inductive invariant as the conjunct of all generated Regular Invariants, and uses it to prove the protocol, without defining any additional invariants.
Claim: Users write fewer lines of proof code using Basilisk, compared to Kondo.
This claim is evaluated by the "Lines of proof" columns in Table 1 of the paper.
The lines of proof code a user writes when using Basilisk is contained in the respective basilisk/<protocol>/automate_gen2/applicationProofDemo.dfy files.
To compare with Kondo, we include the Kondo protocols under the /kondo directory. For Kondo, the number of lines of "Safety proof" is the lines of code in kondo/<protocol>/sync/applicationProof.dfy, minus the lines for invariant definitions. Meanwhile, the "Total" proof is the sum of the "Safety" column and "Mods" column in Table 1 of the Kondo paper.
To generate these numbers, simply run
cd basilisk/evaluation
python3 eval.pyThe results are written to the sloc.csv file, and should match the numbers in Table 1 of the paper. These numbers are:
| Basilisk safety |
Basilisk total |
Kondo safety |
Kondo total |
|
|---|---|---|---|---|
| Echo Server | 8 | 49 | 0 | 40 |
| Ring Leader Election | 80 | 115 | 30 | 63 |
| Simplified Leader Election | 45 | 87 | 33 | 94 |
| Paxos | 444 | 486 | 558 | 777 |
| Paxos-combined | 445 | 497 | - | - |
| Paxos-dynamic | 479 | 522 | - | - |
| Flexible Paxos | 441 | 483 | 559 | 780 |
| Distributed Lock | 0 | 39 | 0 | 31 |
| ShardedKV | 0 | 39 | 35 | 68 |
| ShardedKV-Bactched | 0 | 38 | 0 | 31 |
| Lock Server | 0 | 38 | 20 | 59 |
| Two-Phase Commit | 95 | 139 | 119 | 186 |
| Three-Phase Commit | 108 | 152 | - | - |
| Reduce | 30 | 69 | - | - |
| Raft Leader Election | 52 | 93 | - | - |
| Multi-Paxos | 522 | 565 | - | - |