
Event 80 · CP-CHAIN-RECOVERY-PROTOCOL-01 first slice · v1.1 cycle opens#37

Merged
junjslee merged 1 commit into master from event-80-chain-recovery-protocol on Apr 29, 2026

Conversation

@junjslee
Owner

Summary

v1.1 architectural cycle opens. First CP shipped per ~/episteme-private/docs/cp-v1.1-architectural.md sequencing — CP-CHAIN-RECOVERY-PROTOCOL-01 (foundational; prereq for Cognitive Arm A / CP-TEMPORAL-INTEGRITY-EXPANSION-01).

Background gap. Pillar 2 (append-only hash chain) is tamper-evident — verify_chain reports the first break-index. But chains break legitimately: disk corruption, schema migration, accidental directory deletion, multi-machine fork, post-fact tamper detection. Until this Event the kernel had a detection contract but no recovery contract. verify reports a break — then what?
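The detection half of that contract can be illustrated with a minimal sketch. This is not the kernel's `verify_chain`; the record layout and the helper names `entry_hash`, `append`, and `first_break_index` are hypothetical, chosen only to show how a hash chain localizes the first broken record without saying anything about how to recover from it.

```python
import hashlib
import json

GENESIS_PREV = None  # hypothetical convention: the first record has no predecessor


def entry_hash(prev_hash, payload):
    """Hash a record together with the hash of the record before it."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return "sha256:" + hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(chain, payload):
    """Append a record that commits to the current head of the chain."""
    prev = chain[-1]["hash"] if chain else GENESIS_PREV
    chain.append({"prev": prev, "payload": payload, "hash": entry_hash(prev, payload)})


def first_break_index(chain):
    """Return the index of the first record that fails verification, or None."""
    prev = GENESIS_PREV
    for i, rec in enumerate(chain):
        if rec["prev"] != prev or rec["hash"] != entry_hash(rec["prev"], rec["payload"]):
            return i
        prev = rec["hash"]
    return None


chain = []
for n in range(4):
    append(chain, {"n": n})

assert first_break_index(chain) is None
chain[2]["payload"]["n"] = 99          # simulate corruption of one record
assert first_break_index(chain) == 2   # detection says where, not what to do next
```

The sketch stops exactly where the background gap begins: once an index is reported, nothing in the verification step tells the operator how to proceed.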

What ships

1. kernel/CHAIN_RECOVERY_PROTOCOL.md (NEW, ~250 lines, public-tier)

Canonical doc enumerating:

  • 5 critical-gap scenarios — disk corruption mid-stream, schema migration across kernel versions, accidental directory deletion / fresh-start, multi-machine fork, adversarial tamper caught post-fact.
  • 3 recovery modes — reset (full rewind), selective (windowed-rebuild), migrate (schema forward-walk).
  • Recovery-attestation envelope schema — every recovery is a chain-anchored event; the genesis of the recovered chain IS the attestation.
  • Threat model — what's closed vs what's deferred to v1.2+ federation work (tail truncation, coordinated FS rewrite, multi-machine merge adjudication).
  • CLI usage examples + post-recovery verification steps.

2. episteme chain recover CLI subcommand

episteme chain recover --mode={reset,selective,migrate} \
    --stream <name> --reason "..." --confirm "..." \
    [--what-was-lost "..."]

| Mode | Status | Behavior |
| --- | --- | --- |
| reset | ✅ functional | Full rewind. Wraps existing reset_stream (CP7) with the new attestation envelope. |
| selective | 🔒 stub (exit 2) | Returns named-dependency error pointing at CP-CHAIN-RECOVERY-PROTOCOL-01 Component 5 (windowed-rebuild algorithm). |
| migrate | 🔒 stub (exit 2) | Returns named-dependency error pointing at CP-TEMPORAL-INTEGRITY-EXPANSION-01 (Cognitive Arm A's supersede-with-history infrastructure). |

Stubs ship in this Event so the API surface is visible and stable: operators see the mode names and their dependencies before the implementations land. The goal is forward-stability of the API, rather than leaving the modes absent by design.
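The stub pattern can be sketched with `argparse`. This is an illustration, not the code in src/episteme/cli.py: the `run` and `build_parser` helpers and the message wording are hypothetical, and treating `--stream`, `--reason`, and `--confirm` as required is an assumption read off the usage line (only `--what-was-lost` is bracketed there). The flag names, mode names, exit code 2, and dependency names come from the description above.

```python
import argparse

EXIT_OK = 0
EXIT_STUB = 2  # exit code the PR reports for the unimplemented modes

# Dependency names come from the mode descriptions above; the wording is illustrative.
STUB_DEPENDENCIES = {
    "selective": "CP-CHAIN-RECOVERY-PROTOCOL-01 Component 5 (windowed-rebuild algorithm)",
    "migrate": "CP-TEMPORAL-INTEGRITY-EXPANSION-01 (supersede-with-history infrastructure)",
}


def build_parser():
    parser = argparse.ArgumentParser(prog="episteme chain recover")
    parser.add_argument("--mode", choices=["reset", "selective", "migrate"], required=True)
    parser.add_argument("--stream", required=True)
    parser.add_argument("--reason", required=True)
    parser.add_argument("--confirm", required=True)
    parser.add_argument("--what-was-lost", default=None)
    return parser


def run(argv):
    """Parse arguments and return the process exit code."""
    args = build_parser().parse_args(argv)
    if args.mode in STUB_DEPENDENCIES:
        # The stub names its dependency instead of failing silently.
        print(f"mode '{args.mode}' is not implemented yet; depends on {STUB_DEPENDENCIES[args.mode]}")
        return EXIT_STUB
    # mode == "reset": the real CLI wraps reset_stream; that call is omitted here.
    return EXIT_OK
```

Because all three modes are already accepted by the parser, a later Event could fill in a stub without changing how operators invoke the command.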

3. Recovery-attestation envelope schema

Genesis record of any recovered chain now carries:

{
  "type":                  "chain_reset" | "chain_recovery_selective" | "chain_recovery_migrate",
  "mode":                  "reset" | "selective" | "migrate",
  "reason":                "<operator rationale>",
  "operator_confirmation": "<operator confirmation phrase>",
  "previous_head":         "sha256:<hex>" | null,
  "recovered_at":          "<ISO-8601 UTC>",
  "archived_from":         "<absolute path to archived prior chain>" | null,
  "what_was_lost":         "<operator description of lost data>" | null
}

core/hooks/_chain.py:reset_stream extended (backward-compatible) to populate the new fields. Default mode='reset' preserves pre-Event-80 caller behavior.
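As a rough illustration of the schema, the sketch below assembles an envelope with the eight documented fields. The function `build_attestation` and its parameter names are hypothetical and are not the signature of `reset_stream`. The field names, the type strings, and the `mode='reset'` default follow the schema above.

```python
from datetime import datetime, timezone

# Mapping between the documented "mode" and "type" values.
TYPE_BY_MODE = {
    "reset": "chain_reset",
    "selective": "chain_recovery_selective",
    "migrate": "chain_recovery_migrate",
}


def build_attestation(reason, confirmation, mode="reset",
                      previous_head=None, archived_from=None, what_was_lost=None):
    """Assemble the genesis payload for a recovered chain (hypothetical helper)."""
    if mode not in TYPE_BY_MODE:
        raise ValueError(f"unknown recovery mode: {mode!r}")
    return {
        "type": TYPE_BY_MODE[mode],
        "mode": mode,
        "reason": reason,
        "operator_confirmation": confirmation,
        "previous_head": previous_head,        # "sha256:<hex>" or None
        "recovered_at": datetime.now(timezone.utc).isoformat(),
        "archived_from": archived_from,        # absolute path or None
        "what_was_lost": what_was_lost,        # operator description or None
    }


envelope = build_attestation("disk corruption mid-stream", "I confirm the reset")
assert envelope["type"] == "chain_reset"
assert len(envelope) == 8
```

Leaving the optional fields as explicit `None` values, rather than omitting the keys, keeps every genesis record the same shape, which matches the `| null` alternatives in the schema.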

Tests

tests/test_chain_recover.py (NEW) — 5/5 pass:

  • reset_genesis_payload_has_all_attestation_fields — verifies all 8 documented fields are emitted
  • reset_genesis_archived_from_null_when_no_prior_chain — recovery on a virgin stream
  • reset_what_was_lost_optional — field is null when not provided
  • reset_default_mode_is_reset — backward compatibility for pre-Event-80 callers
  • reset_archives_prior_chain_and_creates_new_intact_chain — end-to-end: prior entries + recovery + new chain intact + archived file readable

Full test suite green: 158/158 (153 baseline + 5 new). Existing test_chain_and_framework.py covers backward compat — passes unchanged.
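The behavior exercised by the virgin-stream and end-to-end tests can be pictured with a small file-based sketch. It assumes a one-record-per-line JSON file and an `.archived` suffix, neither of which is confirmed by the PR. The helper `reset_stream_sketch` is hypothetical and writes an abbreviated envelope, not the full eight-field one.

```python
import json
import tempfile
from pathlib import Path


def reset_stream_sketch(stream_path, reason, confirmation):
    """Archive the prior chain file (if any) and start a new one with a genesis record."""
    stream_path = Path(stream_path)
    archived_from = None
    if stream_path.exists():
        archive = stream_path.with_name(stream_path.name + ".archived")
        stream_path.rename(archive)            # prior entries stay readable
        archived_from = str(archive.resolve())
    genesis = {                                # abbreviated envelope for illustration
        "type": "chain_reset",
        "mode": "reset",
        "reason": reason,
        "operator_confirmation": confirmation,
        "archived_from": archived_from,
    }
    stream_path.write_text(json.dumps(genesis) + "\n", encoding="utf-8")
    return genesis


with tempfile.TemporaryDirectory() as tmp:
    # Virgin stream: nothing to archive, so archived_from stays None.
    fresh = reset_stream_sketch(Path(tmp) / "fresh.jsonl", "fresh start", "I confirm")
    assert fresh["archived_from"] is None

    # Stream with prior entries: they are moved aside and remain readable.
    stream = Path(tmp) / "demo.jsonl"
    stream.write_text('{"payload": "old entry"}\n', encoding="utf-8")
    genesis = reset_stream_sketch(stream, "disk corruption", "I confirm")
    assert "old entry" in Path(genesis["archived_from"]).read_text(encoding="utf-8")
    assert json.loads(stream.read_text(encoding="utf-8"))["type"] == "chain_reset"
```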

Live smoke tests verified pre-PR:

  • episteme chain recover --help renders all flags
  • --mode=selective returns exit 2 with named-dependency message pointing at Component 5
  • --mode=migrate returns exit 2 with named-dependency message pointing at CP-TEMPORAL-INTEGRITY-EXPANSION-01

What's deferred (named dependencies)

  • Component 5 (selective recovery — windowed-rebuild algorithm). Substantial algorithmic work; deserves a focused Event of its own. Stub ships now so the CLI surface is stable.
  • Component 4 (migrate — schema-migration forward-walk). Hard dependency on CP-TEMPORAL-INTEGRITY-EXPANSION-01 (Cognitive Arm A) — the migration walker needs the supersede-with-history infrastructure. Stub ships now so the CLI surface is stable.

Both deferrals are explicitly named in the spec doc + in the stub messages. Operators see the deferral structure rather than a hidden absence.

Threat-model gaps (out of scope; named for honesty)

Per kernel/CHAIN_RECOVERY_PROTOCOL.md § Threat model:

  • Tail truncation (erase-most-recent without external attestation) — needs chain-head commitment to git or witness server. v1.2+ federation work.
  • Coordinated FS rewrite (file + head atomic rewrite) — needs cryptographic signing of chain rotations. v1.2+ federation work.
  • Multi-machine merge adjudication (selective picks one fork; doesn't decide which "wins") — multi-machine sync protocol is v1.2+ federation concern.

The recovery-attestation envelope is operator-attestation grade, not cryptographic-signature grade. The operator's --confirm string is auditable evidence of intent, not a tamper-proof signature. Mitigation against forged attestations is part of the same v1.2+ federation work.
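The tail-truncation gap can be made concrete with a toy chain. The helpers `link`, `build`, and `internally_consistent` below are hypothetical and stand in for the real chain code; `committed_head` stands in for a head value recorded somewhere external (the doc mentions git or a witness server). The point of the sketch is that a truncated chain still verifies on its own.

```python
import hashlib
import json


def link(prev_hash, payload):
    """Hash a payload together with the previous record's hash."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def build(payloads):
    """Build a toy hash chain from a list of payloads."""
    chain, prev = [], None
    for p in payloads:
        h = link(prev, p)
        chain.append({"prev": prev, "payload": p, "hash": h})
        prev = h
    return chain


def internally_consistent(chain):
    """Check every link using only the chain's own contents."""
    prev = None
    for rec in chain:
        if rec["prev"] != prev or rec["hash"] != link(prev, rec["payload"]):
            return False
        prev = rec["hash"]
    return True


chain = build(["a", "b", "c", "d"])
committed_head = chain[-1]["hash"]   # imagine this was recorded externally

truncated = chain[:-2]               # the two most recent records are erased
assert internally_consistent(truncated)           # the chain alone looks intact
assert truncated[-1]["hash"] != committed_head    # only the external head reveals the loss
```

This is why the gap is deferred rather than closed here: without an external commitment to the head, there is nothing for a local verifier to compare against.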

Soak-protected surfaces touched

| Surface | Status |
| --- | --- |
| kernel/CHAIN_RECOVERY_PROTOCOL.md | NEW (public-tier under AGPL-3.0-or-later) |
| core/hooks/_chain.py | Modified (post-soak; allowed; backward-compatible param additions) |
| src/episteme/cli.py | Modified (CLI extension) |
| tests/test_chain_recover.py | NEW |
| core/blueprints/* / src/episteme/_profile_audit*.py / templates/* / labs/* | UNTOUCHED |

v1.1 cycle queue post-Event-80

  • ✅ CP-FALSIFIABILITY-AUDIT-01 first slice (Event 73 / PR #31)
  • CP-CHAIN-RECOVERY-PROTOCOL-01 first slice (this PR)
  • ⏳ CP-TEMPORAL-INTEGRITY-EXPANSION-01 + Cognitive Arm A
  • ⏳ CP-CONTEXT-AWARE-PROFILE-OVERRIDE-01 (depends on supersede infrastructure)
  • ⏳ CP-ACTIVE-GUIDANCE-RANKING-AUDIT-01 (parallel Arm B)
  • ⏳ CP-DESIGN-BEHAVIOR-VERIFICATION-01 (audit phase)
  • ⏳ CP-OPERATOR-COGNITIVE-BUDGET-01 (pairs with D11 in Arm A)
  • ⏳ CP-MODEL-PROGRESS-OBSOLESCENCE-01 (cross-cutting strategic)
  • ⏳ CP-PROJECT-GOVERNANCE-CONTINUITY-01 (cross-cutting governance)

Plus deferred components from this CP:

  • ⏳ CP-CHAIN-RECOVERY-PROTOCOL-01 Component 4 (migrate) — depends on CP-TEMPORAL-INTEGRITY-EXPANSION-01
  • ⏳ CP-CHAIN-RECOVERY-PROTOCOL-01 Component 5 (selective) — own focused Event

Cross-references

  • Spec source: ~/episteme-private/docs/cp-v1.1-architectural.md § CP-CHAIN-RECOVERY-PROTOCOL-01
  • Pillar 2 substrate: core/hooks/_chain.py cp7-chained-v1 envelope
  • Falsifiability claim it operationalizes: kernel/FALSIFIABILITY_CONDITIONS.md § A1 (Pillar 2 hash chain tamper-evidence)
  • Audit trail: ~/episteme-private/docs/PROGRESS.md Event 80 entry (private)

…_RECOVERY_PROTOCOL.md + episteme chain recover CLI (Event 80)

First v1.1 architectural cycle CP shipped. Foundational; prereq for
Cognitive Arm A (CP-TEMPORAL-INTEGRITY-EXPANSION-01).

Pillar 2 (append-only hash chain) is tamper-evident, but chains break
legitimately: disk corruption, schema migration, accidental directory
deletion, multi-machine fork, post-fact tamper detection. Until this
Event the kernel had a detection contract (verify_chain) but no
recovery contract — verify reports a break, then what?

What ships:

1. kernel/CHAIN_RECOVERY_PROTOCOL.md — canonical doc enumerating the
   5 critical-gap scenarios + 3 recovery modes (reset / selective /
   migrate) + recovery-attestation envelope schema + threat model +
   CLI usage examples.

2. episteme chain recover --mode={reset,selective,migrate} CLI
   subcommand. mode=reset is functional; selective + migrate are
   stubs that return exit code 2 with named-dependency error messages
   so the API surface is visible + stable but the deferred-by-design
   work is honestly named.

3. Recovery-attestation envelope schema. Genesis record of any
   recovered chain carries: type, mode, reason, operator_confirmation,
   previous_head, recovered_at, archived_from, what_was_lost.
   reset_stream extended (backward-compat) to populate the new fields;
   default mode='reset' preserves pre-Event-80 caller behavior.

Mode capabilities:
- reset (functional) — full rewind; archive prior chain; new genesis
  carries attestation envelope. Use for fresh-start recovery + post-
  tamper rewind. Wraps existing reset_stream from CP7.
- selective (stub; depends on Component 5) — partial-corruption
  windowed-rebuild. Identifies the last verifiably-good record;
  archives the corrupted suffix; writes attestation linking the two.
- migrate (stub; depends on CP-TEMPORAL-INTEGRITY-EXPANSION-01) —
  schema-migration forward-walk. v1.0-chain to v1.1-chain
  transformations preserved as supersede-with-history per Cognitive
  Arm A's temporal-integrity infrastructure.

Tests at tests/test_chain_recover.py — 5/5 pass; full suite 158/158.

Live smoke tests verified: --help renders, --mode=selective and
--mode=migrate stubs return exit 2 with named-dependency messages.

Threat-model gaps explicitly named in doc as v1.2+ federation work
(tail truncation, coordinated FS rewrite, multi-machine merge
adjudication).

Components 4 (migrate) + 5 (selective) deferred to follow-up Events
with named dependencies. CP-CHAIN-RECOVERY-PROTOCOL-01 first slice
(doc + reset + stubs) ships now; full implementation continues
through v1.1 cycle.

@junjslee junjslee merged commit 8074750 into master Apr 29, 2026
5 checks passed
@junjslee junjslee deleted the event-80-chain-recovery-protocol branch April 29, 2026 06:08