DiffKeeper: The Kubernetes Time Machine

DiffKeeper is a Black Box Flight Recorder for your containers.

It watches your application's filesystem in real time and records every change. When a container crashes or a CI test flakes, you can rewind the filesystem to a chosen moment and see exactly what happened.

Note: The earlier "stateful containers" design is archived. See Genesis & Pivot for the story.

The Problem: "Why did that test fail?"

You have a flaky test in CI. It fails 1 out of 50 times. You re-run the job, and it passes. You have no idea why.

Logs only show you what the application printed.
They don't show you that a config file was corrupted, a temp file was locked, or a binary was overwritten.

The Solution: Instant Replay

DiffKeeper uses eBPF to capture filesystem writes as they happen and stores them in Pebble. It then shows them as a timeline, so you can pick a rewind point without guessing timestamps.
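One way to picture how a sorted key-value store such as Pebble can double as a timeline: if each write is stored under a key that starts with a big-endian time offset, byte-wise key order equals chronological order. The sketch below illustrates that idea only. The key layout and the names eventKey and keyOffset are assumptions made for this example, not DiffKeeper's actual schema, and the code uses an in-memory slice rather than Pebble itself.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"sort"
	"time"
)

// eventKey builds a hypothetical 16-byte key: 8 bytes of big-endian
// nanoseconds since the start of the recording, followed by an 8-byte
// big-endian sequence number that breaks ties between simultaneous writes.
// Offsets are assumed to be non-negative.
func eventKey(offset time.Duration, seq uint64) []byte {
	key := make([]byte, 16)
	binary.BigEndian.PutUint64(key[0:8], uint64(offset))
	binary.BigEndian.PutUint64(key[8:16], seq)
	return key
}

// keyOffset recovers the time offset from a key produced by eventKey.
func keyOffset(key []byte) time.Duration {
	return time.Duration(binary.BigEndian.Uint64(key[0:8]))
}

func main() {
	// Keys inserted out of order, as they might arrive from concurrent writers.
	keys := [][]byte{
		eventKey(134*time.Second, 2),
		eventKey(1*time.Second, 0),
		eventKey(5*time.Second, 1),
	}

	// A sorted store iterates keys in byte-wise order; sorting the slice
	// with bytes.Compare simulates that iteration order here.
	sort.Slice(keys, func(i, j int) bool {
		return bytes.Compare(keys[i], keys[j]) < 0
	})

	for _, k := range keys {
		fmt.Println(keyOffset(k)) // prints 1s, then 5s, then 2m14s
	}
}
```

Big-endian encoding matters here: with little-endian integers, byte-wise order would no longer match numeric order.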

1) Record a Session

Wrap your flaky test (or any command) with diffkeeper record; the recorder adds minimal overhead.

diffkeeper record --state-dir=/tmp/trace -- go test ./...

2) See the Timeline (no blindfolds)

List every write in chronological order, then pick the exact second to rewind to:

diffkeeper timeline --state-dir=/tmp/trace
[00m:01s] WRITE    status.log (13B)
[00m:05s] WRITE    db.lock (6B)
[02m:14s] WRITE    status.log (22B)   <-- the failure
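If you want to script the step from timeline to export, the bracketed offset can be converted into a value for --time. The sketch below is a rough illustration: the line format is inferred from the sample output above rather than from a format specification, the names timelineLine, entry, and parseLine are made up for this example, and it assumes --time accepts Go-style duration strings such as "2m14s".

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"time"
)

// timelineLine matches lines shaped like "[02m:14s] WRITE    status.log (22B)".
// Anything after the size, such as a trailing annotation, is ignored.
var timelineLine = regexp.MustCompile(`^\[(\d+)m:(\d+)s\]\s+(\S+)\s+(\S+)\s+\((\d+)B\)`)

// entry is a hypothetical in-memory form of one timeline line.
type entry struct {
	Offset time.Duration
	Op     string
	Path   string
	Size   int
}

// parseLine extracts the offset, operation, path, and size from one line.
func parseLine(line string) (entry, error) {
	m := timelineLine.FindStringSubmatch(line)
	if m == nil {
		return entry{}, fmt.Errorf("unrecognized timeline line: %q", line)
	}
	mins, err := strconv.Atoi(m[1])
	if err != nil {
		return entry{}, err
	}
	secs, err := strconv.Atoi(m[2])
	if err != nil {
		return entry{}, err
	}
	size, err := strconv.Atoi(m[5])
	if err != nil {
		return entry{}, err
	}
	return entry{
		Offset: time.Duration(mins)*time.Minute + time.Duration(secs)*time.Second,
		Op:     m[3],
		Path:   m[4],
		Size:   size,
	}, nil
}

func main() {
	e, err := parseLine("[02m:14s] WRITE    status.log (22B)   <-- the failure")
	if err != nil {
		panic(err)
	}
	// Duration.String renders 2 minutes 14 seconds as "2m14s".
	fmt.Printf("--time=%q\n", e.Offset.String()) // prints --time="2m14s"
}
```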

3) Export the Crash Site

Restore the filesystem to the moment of failure:

diffkeeper export --state-dir=/tmp/trace --out=./debug_fs --time="2m14s"

cd ./debug_fs and inspect files exactly as they existed at that moment.
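Conceptually, exporting at a point in time means replaying the recorded writes in order and stopping at the chosen moment. The sketch below shows that replay over an in-memory event list. It is a simplification under stated assumptions: writeEvent and stateAt are hypothetical names, each event carries the full file content rather than a binary patch, and the sample contents (other than the demo's "ERROR: Connection Lost") are invented for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// writeEvent is a hypothetical record of one write: when it happened,
// which file it touched, and the file's full content afterwards.
type writeEvent struct {
	Offset  time.Duration
	Path    string
	Content string
}

// stateAt replays events in chronological order and returns the content of
// every file as of time t (inclusive). The input slice is left unmodified.
func stateAt(events []writeEvent, t time.Duration) map[string]string {
	ordered := make([]writeEvent, len(events))
	copy(ordered, events)
	sort.SliceStable(ordered, func(i, j int) bool {
		return ordered[i].Offset < ordered[j].Offset
	})

	state := make(map[string]string)
	for _, ev := range ordered {
		if ev.Offset > t {
			break // everything after this point is in the future
		}
		state[ev.Path] = ev.Content // later writes overwrite earlier ones
	}
	return state
}

func main() {
	events := []writeEvent{
		{1 * time.Second, "status.log", "OK: connected"},
		{5 * time.Second, "db.lock", "locked"},
		{134 * time.Second, "status.log", "ERROR: Connection Lost"},
	}

	fmt.Println(stateAt(events, 10*time.Second)["status.log"])  // prints OK: connected
	fmt.Println(stateAt(events, 134*time.Second)["status.log"]) // prints ERROR: Connection Lost
}
```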

Drop-in GitHub Action

No curl | sh snippets needed—use the composite action directly:

steps:
  - uses: actions/checkout@v4
  - name: Record flaky test
    uses: saworbit/diffkeeper@v1
    with:
      command: go test ./...
      state-dir: diffkeeper-trace

On failure, the trace is uploaded as an artifact. Run diffkeeper timeline on it to find the culprit write, then diffkeeper export to reconstruct the filesystem locally.

The "Flaky CI" Demo

Run the built-in demo to see the loop end-to-end:

diffkeeper record --state-dir=./trace -- go run ./demo/flaky-ci-test
diffkeeper timeline --state-dir=./trace
diffkeeper export --state-dir=./trace --out=./restored --time="2s"
cat ./restored/status.log  # ERROR: Connection Lost

Architecture

Engine: Pure Go + eBPF (CO-RE).
Storage: Pebble (LSM) for high-speed ingestion.
Diffing: bsdiff (binary patches) for efficient storage.
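To see why storing patches instead of full file copies saves space, consider a much simpler delta scheme than bsdiff: keep only the bytes between the common prefix and common suffix of two versions. The sketch below is a stand-in for illustration, not bsdiff and not DiffKeeper's code; patch, makePatch, and applyPatch are hypothetical names.

```go
package main

import "fmt"

// patch describes how to turn an old version into a new one: keep the first
// Prefix bytes and the last Suffix bytes of the old version, and place
// Middle between them.
type patch struct {
	Prefix int
	Suffix int
	Middle []byte
}

// makePatch computes the common prefix and suffix of old and updated and
// stores only the bytes of updated that lie between them.
func makePatch(old, updated []byte) patch {
	p := 0
	for p < len(old) && p < len(updated) && old[p] == updated[p] {
		p++
	}
	s := 0
	// The suffix may not overlap the prefix in either version.
	for s < len(old)-p && s < len(updated)-p &&
		old[len(old)-1-s] == updated[len(updated)-1-s] {
		s++
	}
	middle := append([]byte(nil), updated[p:len(updated)-s]...)
	return patch{Prefix: p, Suffix: s, Middle: middle}
}

// applyPatch rebuilds the new version from the old version and a patch.
func applyPatch(old []byte, pt patch) []byte {
	out := make([]byte, 0, pt.Prefix+len(pt.Middle)+pt.Suffix)
	out = append(out, old[:pt.Prefix]...)
	out = append(out, pt.Middle...)
	out = append(out, old[len(old)-pt.Suffix:]...)
	return out
}

func main() {
	old := []byte("status=OK; retries=0; mode=fast")
	updated := []byte("status=ERROR; retries=0; mode=fast")

	pt := makePatch(old, updated)
	fmt.Printf("patch payload: %d bytes, full file: %d bytes\n", len(pt.Middle), len(updated))
	fmt.Printf("round trip ok: %v\n", string(applyPatch(old, pt)) == string(updated))
}
```

bsdiff handles far more than a single changed region, which is what makes it suitable for binaries; the prefix/suffix trick only shows the basic trade of storage for reconstruction work.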

CI / Dogfooding

GitHub Actions (.github/workflows/ci.yml) runs unit/race tests, cross-platform builds, and a functional time-machine test that records a flaky script and verifies exports.
BoltDB-era workflows remain archived under docs/archive/v1-legacy/workflows/.

Requirements & Compatibility

Build Process: The recommended way to build the project is with Docker, which needs no local toolchain: run make build-dockerized. For local builds, you need Go, clang, and bpftool.
Runtime Privileges: The core recording feature requires root privileges (for example, via sudo) on Linux to attach the eBPF probes to the kernel. The application reports a clear error if run without them.
Cross-Platform Support: The high-performance eBPF monitoring is Linux-specific. The tool provides a fallback for macOS and Windows, but its behavior and performance will differ.
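The privilege and platform rules above can be summarized as a small decision function. This is a rough sketch, not DiffKeeper's actual startup logic: chooseBackend, errNeedRoot, and the backend names are hypothetical, and it treats "effective UID 0" as the privilege test, whereas a real eBPF loader may also work with specific Linux capabilities.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"runtime"
)

// errNeedRoot is a hypothetical error for unprivileged runs on Linux.
var errNeedRoot = errors.New("eBPF recording requires root privileges; re-run with sudo")

// chooseBackend picks a recording backend from the operating system name
// and the effective user ID. Taking both as parameters keeps it testable.
func chooseBackend(goos string, euid int) (string, error) {
	if goos != "linux" {
		// macOS and Windows use the fallback monitor, with different
		// behavior and performance.
		return "fallback", nil
	}
	if euid != 0 {
		return "", errNeedRoot
	}
	return "ebpf", nil
}

func main() {
	// os.Geteuid returns -1 on Windows, which is fine because the
	// non-Linux branch is checked first.
	backend, err := chooseBackend(runtime.GOOS, os.Geteuid())
	if err != nil {
		fmt.Println("cannot record:", err)
		return
	}
	fmt.Println("recording backend:", backend)
}
```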

Getting Started

See the Quickstart to record, view the timeline, and export your first trace.
