Implement leanSpec devnet-4 (Dual-Key Validators, Recursive Aggregation)#209

Open
dimka90 wants to merge 36 commits into main from devnet-4

@dimka90 dimka90 commented Apr 12, 2026

Summary

This PR implements the full set of leanSpec devnet-4 changes.

  • Bumps LEAN_SPEC_COMMIT_HASH from devnet-3 (be8531) → 16e50a5
  • Integrates 87 spec commits, including:
    • Dual-key validator architecture
    • Block envelope redesign
    • Recursive XMSS aggregation
    • Block building rewrite
    • Fork-choice convergence fixes

Devnet validation:

  • 3-node devnet with 5 validators
  • Justification and finalization advancing
  • Blocks include attestations
  • Recursive aggregation producing valid proofs across nodes

Scope of Changes

1. Dual-Key Validators

  • Validator split into:
    • attestation_pubkey
    • proposal_pubkey
  • Key management updated:
    • Dual XMSS keypairs per validator
    • YAML format updated (annotated_validators.yaml)
  • Signing rules:
    • Proposer → proposal key
    • Attesters → attestation key
  • SSZ updated (validator size now 112 bytes)

2. Block Structure Redesign

  • Removed:
    • BlockWithAttestation
    • SignedBlockWithAttestation
  • Introduced:
    • SignedBlock (flat structure)
  • Proposer now signs:
    • hash_tree_root(block)
  • Removed proposer-attestation processing flow

3. Store & Attestation Handling

  • Renamed GossipSignatures → AttestationSignatures
  • Enforced:
    • Unique AttestationData per block
    • MAX_ATTESTATIONS_DATA = 16
  • Non-aggregators:
    • Validate and drop raw gossip signatures (spec-compliant)

4. Block Building Rewrite

  • Switched from per-validator → per-AttestationData aggregation
  • Introduced:
    • Fixed-point loop with greedy proof selection
    • Trial STF to detect justification advancement
  • Enforced attestation cap during build
  • Correct genesis parent-root derivation

5. Recursive XMSS Aggregation

  • New pipeline:
    • Select → Fill → Aggregate
  • Supports recursive proofs via:
    • AggregateWithChildren (FFI)
  • Added:
    • ChildProof type (Go FFI)
    • Children proof arrays (Rust multisig backend)

Convergence & Consensus Fixes

Tick Ordering Fix

  • updateHead now runs before proposal
  • Pending blocks drained before attestation production

Impact:

  • Ensures all nodes agree on the same head before attesting
  • Eliminates divergent target/source roots

Fork-Choice Convergence

  • Deterministic tiebreak:
    • checkpointSupersedes with root comparison
  • Uses:
    • store justified (not head-state justified)

Note:

  • Matches leanSpec PR #595 (spec is reverting to this behavior)

Aggregation Pipeline Improvements

  • Drain gossip attestations before aggregation
  • Skip self-verification for aggregator
  • Performance optimizations:
    • DFT twiddle precomputation (~10x proving speedup)
    • AVX2 SIMD hashing
    • Prover pre-initialization
  • Payload buffers:
    • Unbounded (pruned only on finalization)

Networking Updates

  • Subnet model:
    • Per-validator subnet subscriptions
    • --aggregate-subnet-ids flag now functional
  • Aggregator fallback:
    • Subnet 0
  • Non-aggregators:
    • Validate-and-drop raw gossip signatures

Spec Test Updates

  • Fixture parsing:
    • Supports dual pubkeys (with legacy fallback)
    • Supports new SignedBlock format
  • Fork-choice test runner:
    • New step types:
      • attestation
      • gossipAggregatedAttestation
      • tick
  • Correct attestation weighting in fork choice
  • Time progression supported in tests

Spec Compliance

  • Updated:

    • LEAN_SPEC_COMMIT_HASH = 16e50a5fcc8be837f09aabf30c92e653bc36dad4
  • Test status:

    • ✅ 91 / 96 passing
    • ⚠️ 5 failing (signature verification fixtures)
  • Pending:

    • Regenerate fixtures with --scheme=prod
  • Known deviation:

    • Uses store justified + tiebreak
    • Aligns with upcoming spec change (leanSpec PR #595)

Testing

Local

  • go build ./... → clean build
  • go test ./... → unit tests passing
  • go test -tags spectests ./spectests/...
    • 91 passing
    • 5 pending (signature fixtures)

Devnet

  • 3-node network verified:
    • Justification and finalization progressing
    • Blocks propagate correctly
    • All nodes converge on same head

Aggregation

  • Recursive proofs:
    • Verified across nodes

Interop

  • Ready for:
    • devnet-4 shared multi-client testnet

Notes / Follow-ups

  • Regenerate signature fixtures with --scheme=prod
  • Monitor leanSpec PR #595 merge (spec alignment)
  • Validate behavior in multi-client environments at scale

Checklist

  • Spec updated to devnet-4
  • Dual-key validators implemented
  • Block envelope migrated
  • Recursive aggregation integrated
  • Fork-choice convergence fixed
  • Devnet validated
  • Signature fixtures regenerated (--scheme=prod)

dimka90 added 21 commits April 11, 2026 18:32
…DATA

Split Validator.Pubkey into AttestationPubkey + ProposalPubkey (SSZ 60→112
bytes) per leanSpec PR #449. Both keys populated from same genesis key;
Phase 2 will parse separate keys from GenesisValidatorEntry.

Add MAX_ATTESTATIONS_DATA=16 constant per leanSpec PR #536.

All .Pubkey references updated to .AttestationPubkey. Phase 3 will correct
proposal-context call sites to use ProposalPubkey.

Makefile: make clean no longer deletes *_encoding.go (sszgen not installed).

Verified against leanSpec validator.py: field order, types, and SSZ size match.
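
The size change quoted above can be checked with simple arithmetic. The sketch below is illustrative only: the 52-byte key width is inferred from the 60→112 figures (one extra key adds 52 bytes), and the 8-byte `Index` field is an assumption about what fills the remainder. It is not the generated SSZ code.

```go
package main

import "fmt"

// pubkeySize is inferred from the sizes quoted in the commit message
// (112 - 60 = 52 extra bytes for the second key); it is an assumption.
const pubkeySize = 52

// Validator is an illustrative dual-key layout. The Index field is assumed
// to account for the remaining 8 bytes.
type Validator struct {
	AttestationPubkey [pubkeySize]byte
	ProposalPubkey    [pubkeySize]byte
	Index             uint64
}

// fixedSSZSize sums the field widths. An SSZ container made only of
// fixed-size fields is encoded as the concatenation of those fields.
func fixedSSZSize(v Validator) int {
	return len(v.AttestationPubkey) + len(v.ProposalPubkey) + 8
}

func main() {
	fmt.Println("validator SSZ size:", fixedSSZSize(Validator{}))
}
```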
…ONS_DATA

- TestValidatorDualKeysIndependent: verifies attestation and proposal keys
  are stored independently through SSZ roundtrip
- TestMaxAttestationsDataConstant: verifies constant equals 16
- TestValidators: verifies genesis populates both keys and they match
  (same genesis key used for both until Phase 2)
Split KeyManager into attestationKeys + proposalKeys maps per leanSpec PR #449.
SignAttestation routes through attestation key, SignBlock through proposal key.
Each key advances its OTS state independently.

Genesis config now parses GenesisValidatorEntry with attestation_pubkey +
proposal_pubkey from YAML (spec: lean_spec/subspecs/genesis/config.py).

Keygen generates two XMSS keypairs per validator with separate seed phrases,
producing validator_N_attestation_sk.ssz and validator_N_proposal_sk.ssz files.

Cross-ref: ethlambda ValidatorKeyPair, zeam ValidatorKeys.

Tests:
- TestKeyManagerDualKeyRouting: verifies attestation/proposal key routing,
  signature correctness, and cross-verification failure
- TestKeyManagerSignErrors: verifies error cases for unknown validators
- Genesis tests updated for dual-key YAML format
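
A minimal sketch of the routing described above, assuming a hypothetical `countingKey` in place of a real XMSS secret key. It only shows that attestation and proposal signing go through separate maps and that each key's one-time-signature counter advances independently; the actual KeyManager types and signatures may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// countingKey is a hypothetical stand-in for a stateful XMSS secret key.
// It only counts how many one-time signatures it has issued.
type countingKey struct {
	label string
	used  int
}

func (k *countingKey) sign(root [32]byte) string {
	k.used++
	return fmt.Sprintf("%s:%x", k.label, root[:2])
}

var errUnknownValidator = errors.New("unknown validator")

// KeyManager keeps one key per role per validator.
type KeyManager struct {
	attestationKeys map[uint64]*countingKey
	proposalKeys    map[uint64]*countingKey
}

// SignAttestation always routes through the attestation key.
func (km *KeyManager) SignAttestation(validator uint64, root [32]byte) (string, error) {
	k, ok := km.attestationKeys[validator]
	if !ok {
		return "", errUnknownValidator
	}
	return k.sign(root), nil
}

// SignBlock always routes through the proposal key.
func (km *KeyManager) SignBlock(validator uint64, root [32]byte) (string, error) {
	k, ok := km.proposalKeys[validator]
	if !ok {
		return "", errUnknownValidator
	}
	return k.sign(root), nil
}

func newDemoKeyManager() *KeyManager {
	return &KeyManager{
		attestationKeys: map[uint64]*countingKey{0: {label: "att"}},
		proposalKeys:    map[uint64]*countingKey{0: {label: "prop"}},
	}
}

func main() {
	km := newDemoKeyManager()
	sig, _ := km.SignAttestation(0, [32]byte{0xaa})
	fmt.Println(sig)
	sig, _ = km.SignBlock(0, [32]byte{0xbb})
	fmt.Println(sig)
	fmt.Println(km.attestationKeys[0].used, km.proposalKeys[0].used)
}
```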
Proposer now signs with ProposalPubkey, attesters with AttestationPubkey.

- maybePropose: signs proposer attestation with GetProposalKey() instead
  of SignAttestation() (which uses attestation key)
- verifyBlockSignatures: verifies proposer signature against ProposalPubkey
  instead of AttestationPubkey
- Signed message remains attestation data root for now; Phase 4 will
  change to hash_tree_root(block)

All four signing/verification paths confirmed:
  Proposer:     signs with proposal key, verifies with ProposalPubkey
  Attester:     signs with attestation key (SignAttestation → attestationKeys)
  Gossip verify: uses AttestationPubkey
  Block verify:  body attestations use AttestationPubkey
…igning

Remove BlockWithAttestation and rename SignedBlockWithAttestation → SignedBlock
per leanSpec PR #449. Proposer now signs hash_tree_root(block) with proposal
key. ProposerAttestation removed entirely — proposer attests at interval 1
like all validators.

Changes across 15 files:
- types/block.go: BlockWithAttestation removed, SignedBlock holds *Block directly
- types/block_encoding.go: BlockWithAttestation SSZ methods removed
- node/store_block.go: ProcessProposerAttestation deleted, verifyBlockSignatures
  verifies proposer sig over block root with ProposalPubkey
- node/validator.go: maybePropose signs block root, produceAttestations removes
  proposer skip, adds self-delivery for aggregator (ethlambda pattern)
- node/block.go, node/node.go, node/consensus_store.go: type updates
- p2p/gossip.go, p2p/publish.go, p2p/peers.go, p2p/reqresp.go: type updates
- spectests/, xmss/: test fixtures updated
- Makefile: sszgen target updated for SignedBlock
- node/store_errors.go: dead ErrProposerAttestationMismatch removed

Cross-ref: leanSpec block.py, ethlambda devnet4-phase4-network, zeam devnet4
…/cap enforcement

Rename GossipSignature* → AttestationSignature* per leanSpec store.py:
- GossipSignatureEntry → AttestationSignatureEntry
- GossipDataEntry → AttestationDataEntry
- GossipSignatureMap → AttestationSignatureMap
- GossipDeleteKey → AttestationDeleteKey
- ConsensusStore.GossipSignatures → AttestationSignatures
- Prometheus metric: lean_gossip_signatures → lean_attestation_signatures

on_block now enforces per leanSpec store.py lines 549-556:
- Each AttestationData appears at most once per block (DuplicateAttestationData)
- Distinct count ≤ MAX_ATTESTATIONS_DATA=16 (TooManyAttestationData)

Cross-ref: zeam chain.zig DuplicateAttestationData/TooManyAttestationData,
ethlambda store.rs DuplicateAttestationData, leanSpec store.py on_block
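
The two on_block rules can be sketched as a single pass over the block's attestation data roots. The function name, the error variables, and the use of a 32-byte root as the identity of an AttestationData are assumptions for illustration, not gean's actual code.

```go
package main

import (
	"errors"
	"fmt"
)

const maxAttestationsData = 16

var (
	errDuplicateAttestationData = errors.New("duplicate AttestationData in block")
	errTooManyAttestationData   = errors.New("too many distinct AttestationData in block")
)

// checkAttestationData rejects a block body whose attestations repeat an
// AttestationData or carry more than maxAttestationsData distinct entries.
func checkAttestationData(dataRoots [][32]byte) error {
	seen := make(map[[32]byte]struct{}, len(dataRoots))
	for _, root := range dataRoots {
		if _, dup := seen[root]; dup {
			return errDuplicateAttestationData
		}
		seen[root] = struct{}{}
		if len(seen) > maxAttestationsData {
			return errTooManyAttestationData
		}
	}
	return nil
}

// distinctRoots builds n distinct roots for the demo.
func distinctRoots(n int) [][32]byte {
	roots := make([][32]byte, n)
	for i := range roots {
		roots[i][0] = byte(i + 1)
	}
	return roots
}

func main() {
	fmt.Println(checkAttestationData(distinctRoots(16)))
	fmt.Println(checkAttestationData(distinctRoots(17)))
	fmt.Println(checkAttestationData(append(distinctRoots(2), distinctRoots(1)...)))
}
```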
… building

Rewrite buildBlock from per-validator latest-vote to per-AttestationData
fixed-point algorithm per leanSpec state.py build_block.

- Sort payloads by target.slot for deterministic order
- Iterate per-AttestationData (not per-validator) with source==current_justified
- selectBestProof: pick single best proof per data (max validator coverage)
- MAX_ATTESTATIONS_DATA=16 cap enforced on build side
- Trial STF detects justification advance, continues with new source
- Genesis edge case: parent_root substitution for zero-hash checkpoint

Pre-Phase-7: one proof per AttestationData (no duplicates). Phase 7 will add
recursive children aggregation for multi-proof compaction.

Old per-validator helpers removed: selectLatestPerValidator,
groupValidatorsByEntry, emitAttestationsForGroup.

Cross-ref: leanSpec state.py:630-780, zeam getProposalAttestationsUnlocked,
ethlambda build_block
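
As an illustration of the "max validator coverage" rule for selectBestProof, here is a minimal sketch. The `Proof` type is hypothetical and the tie-break (first proof wins) is an assumption; the real function may break ties differently.

```go
package main

import "fmt"

// Proof is a hypothetical aggregated proof for one AttestationData.
// Participants is assumed to hold distinct validator indices.
type Proof struct {
	Participants []uint64
}

// selectBestProof returns the proof covering the most validators.
// On ties the earliest proof wins, which keeps the result deterministic
// as long as the input order is deterministic.
func selectBestProof(proofs []Proof) (best Proof, ok bool) {
	for _, p := range proofs {
		if !ok || len(p.Participants) > len(best.Participants) {
			best, ok = p, true
		}
	}
	return best, ok
}

func main() {
	proofs := []Proof{
		{Participants: []uint64{0, 1}},
		{Participants: []uint64{1, 2, 3}},
		{Participants: []uint64{4}},
	}
	best, ok := selectBestProof(proofs)
	fmt.Println(best.Participants, ok)
}
```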
Implement three-phase Select/Fill/Aggregate per leanSpec store.py:936-1071.

Rust FFI (xmss/rust/multisig-glue):
- Updated to leanMultisig rev fd8814045 with recursive xmss_aggregate API
- xmss_aggregate now takes raw sigs + children proofs + log_inv_rate
- xmss_verify_aggregated takes proof bytes directly (not opaque pointer)

Go FFI (xmss/ffi.go):
- ChildProof type for passing pre-aggregated proofs
- AggregateWithChildren() wraps recursive FFI
- AggregateSignatures() backward-compatible (no children)
- VerifyAggregatedSignature updated for new verify API

Store aggregate (node/store_aggregate.go):
- Phase 1 Select: greedily pick child proofs from new+known pools
- Phase 2 Fill: collect raw gossip sigs for uncovered validators
- Phase 3 Aggregate: recursive proof via AggregateWithChildren
- Gossip entries fully consumed per-dataRoot after aggregation
- Minimum evidence gate: ≥1 raw sig OR ≥2 children

Cross-ref: leanSpec aggregation.py, zeam forkchoice.zig aggregateUnlocked,
zeam multisig-glue xmss_aggregate with children
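
The Select and Fill phases plus the minimum evidence gate can be sketched without any cryptography. Everything below is an assumption-laden illustration: children are plain lists of validator indices, the greedy rule picks the child that adds the most uncovered validators, and the final recursive proof step (AggregateWithChildren) is left out.

```go
package main

import "fmt"

// planAggregation decides which inputs would feed one recursive proof.
// children[i] lists the validators covered by pre-aggregated proof i;
// rawSigners lists validators with a raw gossip signature available.
func planAggregation(children [][]uint64, rawSigners []uint64) (chosen []int, fill []uint64, ok bool) {
	covered := map[uint64]bool{}
	used := make([]bool, len(children))

	// Phase 1 (Select): greedily take the child adding the most new validators.
	for {
		best, bestGain := -1, 0
		for i, c := range children {
			if used[i] {
				continue
			}
			gain := 0
			for _, v := range c {
				if !covered[v] {
					gain++
				}
			}
			if gain > bestGain {
				best, bestGain = i, gain
			}
		}
		if best < 0 {
			break // no remaining child adds coverage
		}
		used[best] = true
		chosen = append(chosen, best)
		for _, v := range children[best] {
			covered[v] = true
		}
	}

	// Phase 2 (Fill): raw signatures only for validators still uncovered.
	for _, v := range rawSigners {
		if !covered[v] {
			covered[v] = true
			fill = append(fill, v)
		}
	}

	// Minimum evidence gate: at least one raw signature or two children.
	ok = len(fill) >= 1 || len(chosen) >= 2
	return chosen, fill, ok
}

func main() {
	chosen, fill, ok := planAggregation(
		[][]uint64{{0, 1}, {1, 2}, {1}},
		[]uint64{2, 3},
	)
	fmt.Println(chosen, fill, ok)
}
```

A single child with no raw signatures fails the gate, since re-proving it would add nothing.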
…utation

Update hashsig-glue to leansig devnet4 branch (Dim46Base8, 2536-byte sigs).
Add precompute_dft_twiddles to multisig-glue prover setup — required by the
sumcheck backend for polynomial evaluation tables. SignatureSize 3112→2536,
SSZ offsets updated across attestation and block encoding.
Add runtime.Pinner to AggregateWithChildren and VerifyAggregated to pin
Go slices containing pointers before passing to C. The cgo pointer checks
reject Go memory that holds unpinned Go pointers; runtime.Pinner (Go 1.21+)
makes such arguments legal.
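
A minimal sketch of the pinning pattern, written in pure Go so it runs without cgo. The helper name `withPinned` and the callback shape are hypothetical; in the real FFI wrapper the callback would be the C call. It requires Go 1.21 or newer.

```go
package main

import (
	"fmt"
	"runtime"
)

// withPinned pins the backing array of every non-empty buffer, hands the
// element pointers to fn, and unpins everything when fn returns.
// In an FFI wrapper, fn would be the place where the C function is called.
func withPinned(bufs [][]byte, fn func(ptrs []*byte)) {
	var pinner runtime.Pinner
	defer pinner.Unpin()

	ptrs := make([]*byte, len(bufs))
	for i, b := range bufs {
		if len(b) == 0 {
			continue // nothing to pin; leave a nil pointer
		}
		pinner.Pin(&b[0])
		ptrs[i] = &b[0]
	}
	fn(ptrs)
}

func main() {
	proofs := [][]byte{
		append(make([]byte, 0, 4), 1, 2, 3),
		nil,
		append(make([]byte, 0, 4), 9),
	}
	withPinned(proofs, func(ptrs []*byte) {
		for i, p := range ptrs {
			if p == nil {
				fmt.Println(i, "empty")
				continue
			}
			fmt.Println(i, "first byte:", *p)
		}
	})
}
```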
Move updateHead before maybePropose at interval 0 per leanSpec get_proposal_head.
Add PromoteNewToKnown + updateHead in maybePropose for freshest head before building.
Use headState.LatestJustified for both attester and builder source filter matching
ethlambda and leanSpec. Delegate interval-2 aggregation to Engine with drain of
pending gossip attestations. Add minimum 2 inputs check for prover.
Update fixture parser for camelCase JSON (attestationPubkey, proposalPubkey).
Update signatures test for flat sigSBA struct and signedBlock JSON key.
Remove ProposerAttestation from forkchoice test per leanSpec PR #449.
Update ffi_test for Dim46Base8 signature size.
Non-aggregator nodes now return immediately in onGossipAttestation
without XMSS verification (~500ms saved per attestation). Per leanSpec
store.py:385-386 only aggregators store gossip signatures. Non-aggregators
receive aggregated proofs via the aggregation gossip topic instead.
VoteTracker.AppliedIndex was not remapped when Prune shifted node indices,
causing phantom weight inflation and fork-choice divergence across nodes.
Add RemapIndices to VoteStore that adjusts AppliedIndex, LatestKnown.Index,
and LatestNew.Index after pruning — votes pointing to pruned nodes are
invalidated. Matches zeam forkchoice.zig rebase (lines 760-796).
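
The remapping step can be illustrated as follows. This sketch assumes pruning drops a prefix of the node array and shifts the rest down, represents invalidated votes with -1, and flattens `LatestKnown.Index` and `LatestNew.Index` into plain fields. None of these details are confirmed by the commit.

```go
package main

import "fmt"

const invalidIndex = -1

// VoteTracker holds node indices into the fork-choice array (flattened
// here for brevity).
type VoteTracker struct {
	AppliedIndex     int
	LatestKnownIndex int
	LatestNewIndex   int
}

// shiftMapping builds an old-to-new index table for a prune that removes
// the first `pruned` nodes and shifts the remaining ones down.
func shiftMapping(nodeCount, pruned int) []int {
	m := make([]int, nodeCount)
	for old := range m {
		if old < pruned {
			m[old] = invalidIndex
		} else {
			m[old] = old - pruned
		}
	}
	return m
}

func remap(old int, oldToNew []int) int {
	if old < 0 || old >= len(oldToNew) {
		return invalidIndex
	}
	return oldToNew[old]
}

// RemapIndices rewrites every stored index; votes that pointed at a pruned
// node become invalid instead of silently pointing at a different node.
func RemapIndices(votes []VoteTracker, oldToNew []int) {
	for i := range votes {
		votes[i].AppliedIndex = remap(votes[i].AppliedIndex, oldToNew)
		votes[i].LatestKnownIndex = remap(votes[i].LatestKnownIndex, oldToNew)
		votes[i].LatestNewIndex = remap(votes[i].LatestNewIndex, oldToNew)
	}
}

func main() {
	votes := []VoteTracker{{AppliedIndex: 0, LatestKnownIndex: 3, LatestNewIndex: 4}}
	RemapIndices(votes, shiftMapping(5, 2))
	fmt.Printf("%+v\n", votes[0])
}
```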
Enable AVX2 SIMD via -Ctarget-cpu=haswell in Makefile — brings aggregation
from 2-3s to ~450ms (under 800ms interval budget). Refactor aggregation to
snapshot-and-ship pattern for future async support. Fix unconsumed gossip
sigs being lost during snapshot swap — sigs that can't aggregate (< 2 inputs)
are returned to the store for the next round. Add drainAggResults in
maybePropose for freshest payloads before building.
Only pass -Ctarget-cpu=haswell on x86_64 machines. ARM builds (Apple M1,
Graviton) skip the flag and use native instruction set. Prevents build
failure on non-x86 platforms.
Remove snapshot-and-ship pattern (TakeAggregationSnapshot, AggregateFromSnapshot,
AggResultCh, drainAggResults, runAggregation, applyAggregationResult) that
introduced source instability. Revert to direct store access matching gean main.
Remove unused checkpointSupersedes tiebreaker. Remove atomic.Bool guard.
…oads

Drain all queued blocks from BlockCh before producing attestations at
interval 1. Go's select randomly picks between tick and block channels,
causing attestations with stale head when a block sits unprocessed.
drainPendingBlocks + updateHead ensures all nodes converge on the same
head before attesting, eliminating cross-node source root divergence.
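
A non-blocking drain of a channel is a small, standard Go pattern. The sketch below assumes a hypothetical `Block` type and a `process` callback; the real drainPendingBlocks may differ in signature and in what it does with each block.

```go
package main

import "fmt"

// Block is a placeholder for the real signed block type.
type Block struct{ Slot uint64 }

// drainPendingBlocks processes every block already queued on ch and
// returns without waiting for new ones. It reports how many it handled.
func drainPendingBlocks(ch <-chan Block, process func(Block)) int {
	n := 0
	for {
		select {
		case b, ok := <-ch:
			if !ok {
				return n // channel closed
			}
			process(b)
			n++
		default:
			return n // queue is empty right now
		}
	}
}

func main() {
	ch := make(chan Block, 8)
	ch <- Block{Slot: 10}
	ch <- Block{Slot: 11}

	n := drainPendingBlocks(ch, func(b Block) { fmt.Println("processed slot", b.Slot) })
	fmt.Println("drained", n)
	// At this point the caller would recompute the head, then attest.
}
```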

Remove FIFO capacity cap on KnownPayloads and NewPayloads (now unbounded,
pruned on finalization only). Matches zeam's approach — prevents fresh
attestations from being evicted before the builder can include them,
which caused justification stalls after ~500 slots.
Both attester and builder use store.LatestJustified for source checkpoint.
Add checkpointSupersedes with deterministic root tiebreak at same slot —
all nodes converge to the same justified root regardless of block
processing order. Remove verbose SKIP source mismatch debug logs.
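
One way to write a slot-then-root comparison is shown below. The direction of the root tiebreak (lexicographically larger root wins) is an assumption made for illustration; the commit only states that the tiebreak is deterministic.

```go
package main

import (
	"bytes"
	"fmt"
)

// Checkpoint is a simplified (root, slot) pair.
type Checkpoint struct {
	Root [32]byte
	Slot uint64
}

// checkpointSupersedes reports whether candidate should replace current.
// A higher slot always wins. At equal slots the larger root wins (assumed
// direction), so every node ends on the same checkpoint regardless of the
// order in which it processed blocks.
func checkpointSupersedes(candidate, current Checkpoint) bool {
	if candidate.Slot != current.Slot {
		return candidate.Slot > current.Slot
	}
	return bytes.Compare(candidate.Root[:], current.Root[:]) > 0
}

func main() {
	a := Checkpoint{Root: [32]byte{0x01}, Slot: 8}
	b := Checkpoint{Root: [32]byte{0x02}, Slot: 8}
	fmt.Println(checkpointSupersedes(b, a), checkpointSupersedes(a, b))
}
```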
Subscribe to attestation subnets based on validator assignments instead
of blanket all-subnet subscription. Aggregators subscribe to validator-
derived subnets plus explicit --aggregate-subnet-ids. Non-aggregators
subscribe only to their validators' subnets. Fallback to subnet 0 if
no subnets derived. Parse --aggregate-subnet-ids CLI flag.
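
The subscription rule can be sketched as a pure function. The mapping from validator to subnet (`index % subnetCount`) and the function name are assumptions for illustration; only the aggregator/non-aggregator split and the subnet 0 fallback come from the description above.

```go
package main

import (
	"fmt"
	"sort"
)

// subnetsToSubscribe returns the sorted set of attestation subnets a node
// should join. explicit holds the IDs passed via --aggregate-subnet-ids and
// is only honored for aggregators.
func subnetsToSubscribe(validators []uint64, subnetCount uint64, isAggregator bool, explicit []uint64) []uint64 {
	set := map[uint64]struct{}{}
	if subnetCount > 0 {
		for _, v := range validators {
			set[v%subnetCount] = struct{}{} // assumed derivation rule
		}
	}
	if isAggregator {
		for _, id := range explicit {
			set[id] = struct{}{}
		}
	}
	if len(set) == 0 {
		return []uint64{0} // fallback when nothing was derived
	}
	out := make([]uint64, 0, len(set))
	for id := range set {
		out = append(out, id)
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

func main() {
	fmt.Println(subnetsToSubscribe([]uint64{0, 5, 9}, 4, true, []uint64{3}))
	fmt.Println(subnetsToSubscribe([]uint64{0, 5, 9}, 4, false, []uint64{3}))
	fmt.Println(subnetsToSubscribe(nil, 4, false, nil))
}
```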

Replace cross-references to other client implementations with leanSpec
references throughout the codebase.
@dimka90 dimka90 linked an issue Apr 12, 2026 that may be closed by this pull request
35 tasks
@dimka90 dimka90 requested review from mananuf and shaaibu7 April 12, 2026 22:26
dimka90 added 2 commits April 13, 2026 07:14
…leanup

- Add attestation and gossipAggregatedAttestation step types to FC test runner
- Flat block format (no fcStepBlock wrapper) matching devnet-4 fixtures
- Block body attestation participants feed fork choice weight correctly
- Time advancement for block/attestation steps
- Export AggregationBitsFromIndices for test use
- gofmt all flagged files (consensus_store, store_aggregate, store_build,
  signatures_test, keys)
dimka90 added 3 commits April 13, 2026 08:34
Process gossip attestations in dedicated goroutines instead of the
sequential event loop. Each incoming attestation gets its own goroutine
for XMSS verification (~500ms each), matching zeam's inline processing
model where attestations are verified as they arrive from libp2p.

Previously, attestations queued in a channel and the main event loop
processed them one-at-a-time between ticks. With 3 attestations from
other nodes at ~500ms each, only 2-3 of 5 were ready by interval 2.
The aggregator produced proofs covering 2-3 validators — below the
4/5 supermajority threshold, causing justification to stall.

Changes:
- AttestationSignatureMap: plain map -> mutex-protected struct
- Snapshot() method for lock-free iteration during aggregation
- Gossip attestations processed in dedicated goroutines, not the event loop
- Remove drainPendingAttestations (no longer needed)
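
A minimal sketch of a mutex-protected map with a Snapshot() method, assuming a layout of data root → validator index → signature bytes. The field and method names are illustrative. The snapshot copies the map structure under the lock so the aggregator can iterate without holding it; signature byte slices are shared, not copied.

```go
package main

import (
	"fmt"
	"sync"
)

// AttestationSignatureMap guards a nested map with a mutex so that
// verification goroutines can insert concurrently.
type AttestationSignatureMap struct {
	mu      sync.Mutex
	entries map[[32]byte]map[uint64][]byte
}

func NewAttestationSignatureMap() *AttestationSignatureMap {
	return &AttestationSignatureMap{entries: map[[32]byte]map[uint64][]byte{}}
}

// Put stores one verified signature.
func (m *AttestationSignatureMap) Put(dataRoot [32]byte, validator uint64, sig []byte) {
	m.mu.Lock()
	defer m.mu.Unlock()
	inner, ok := m.entries[dataRoot]
	if !ok {
		inner = map[uint64][]byte{}
		m.entries[dataRoot] = inner
	}
	inner[validator] = sig
}

// Snapshot returns a copy of both map levels. The caller may range over it
// without holding the lock while writers keep inserting.
func (m *AttestationSignatureMap) Snapshot() map[[32]byte]map[uint64][]byte {
	m.mu.Lock()
	defer m.mu.Unlock()
	out := make(map[[32]byte]map[uint64][]byte, len(m.entries))
	for root, inner := range m.entries {
		cp := make(map[uint64][]byte, len(inner))
		for v, sig := range inner {
			cp[v] = sig
		}
		out[root] = cp
	}
	return out
}

func main() {
	m := NewAttestationSignatureMap()
	root := [32]byte{0x01}

	var wg sync.WaitGroup
	for v := uint64(0); v < 5; v++ {
		wg.Add(1)
		go func(v uint64) { // one goroutine per incoming attestation
			defer wg.Done()
			m.Put(root, v, []byte{byte(v)})
		}(v)
	}
	wg.Wait()

	fmt.Println("signatures for root:", len(m.Snapshot()[root]))
}
```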
Parse both lean-quickstart format (two entries per validator with
pubkey_hex + privkey_file, attester/proposer inferred from filename)
and gean keygen format (one entry with dual key fields).

This enables gean to load keys from lean-quickstart's multi-client
devnet config without requiring format conversion.
dimka90 and others added 10 commits April 13, 2026 17:42
Pin rec_aggregation, leansig_wrapper, and backend to 2dc7867 (devnet4
branch tip). This pulls in the batch AIR sumcheck restructuring (#191)
and the updated cached_bytecode.bin required to interop with ethlambda's
compat/annotated-validators-quickstart-format branch. Cargo.lock updated
to match; Plonky3 transitive deps refreshed b9a9069a -> c7bacaeb.
Dockerfile: add CARGO_ENCODED_RUSTFLAGS=-Ctarget-cpu=haswell to the Rust
FFI build, matching the Makefile and enabling AVX2 SIMD in leanMultisig's
backend crate. Yields roughly 4-6x prover speedup on aggregate proof
generation with no portability cost (haswell is the 2013+ baseline,
identical in feature set to zeam's x86_64-v3).

Makefile: update docker-build target to tag ghcr.io/geanlabs/gean:devnet4
instead of :devnet3 to reflect the current devnet.
Pulls in PR #194 (Devnet4 fix-avx512), which rolls up:
- #192 poseidon avx2 avx512: new SIMD implementations for Poseidon hashing
- #193 fix avx512 (panicked on small instances): AVX512 code path bug fix

No cached_bytecode.bin or rec_aggregation/ changes, so proof format is
identical to 2dc7867 and interop with ethlambda compat branch is
preserved. Pure perf + stability update.
Adds /debug/pprof/ endpoints (index, cmdline, profile, symbol, trace) to
the metrics HTTP server's mux. Enables heap, goroutine, CPU, block, and
mutex profiling on demand via the existing --metrics-port without opening
a new port or adding runtime overhead when idle.

Primary use case: diagnosing finality stalls. When chain advances but
justification does not, a goroutine dump (debug=2) shows which goroutine
is blocked and where, turning a multi-hour mystery into a 30-second
diagnosis. Heap dump separates memory-growth stalls from deadlocks.

Follows the standard Go convention used by go-ethereum, prysm, and
erigon.
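
Registering the standard `net/http/pprof` handlers on a custom mux looks roughly like this. The helper names `newDebugMux` and `statusFor` are hypothetical; the handler functions are the ones exported by the standard library. The demo checks status codes through `httptest` and avoids the `profile` endpoint, which blocks while it samples the CPU.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/pprof"
)

// newDebugMux returns a mux exposing the pprof endpoints. In a real server
// these routes would be added to the existing metrics mux instead.
func newDebugMux() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/debug/pprof/", pprof.Index) // also serves heap, goroutine, block, mutex
	mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
	mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
	return mux
}

// statusFor issues an in-memory GET and returns the response status code.
func statusFor(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	h := newDebugMux()
	for _, path := range []string{
		"/debug/pprof/",
		"/debug/pprof/cmdline",
		"/debug/pprof/goroutine?debug=2", // full goroutine dump
	} {
		fmt.Println(path, statusFor(h, path))
	}
}
```

Against a running node the goroutine dump would be fetched from the metrics port at `/debug/pprof/goroutine?debug=2`.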
Introduces DecodeENR returning structured ENRFields{Seq, PeerID, Multiaddr}
for the networking codec spec test harness. Tolerates ENRs without an ip
field (returns empty Multiaddr) and produces a transport-only multiaddr
without the /p2p/<peerID> suffix, matching leanSpec's fixture expectations.

ParseENR is untouched; bootnode dialing continues to use it unchanged.
This is a purely additive helper for spec compliance testing — no callers
in production invoke DecodeENR.
Adds TestSpecNetworkingCodec under the spectests build tag. Walks
leanSpec/fixtures/consensus/networking_codec/devnet/networking, parses
each JSON fixture, and dispatches by codecName to gean's equivalent
primitive. Reports per-codec pass/fail/skip counts.

Codecs wired in (from leanSpec PR #608):
  - varint: 13/13 passing (gean's EncodeVarint matches LEB128 spec)
  - gossip_message_id: 8/8 passing (SHA256 formula matches spec)
  - enr: 3/3 passing (uses the new DecodeENR helper)
  - gossip_topic: 0/8 — known ecosystem-wide divergence: gean,
    ethlambda, and zeam all use a fixed network name (devnet0) while
    the fixtures expect the hex forkDigest. Intentionally not 'fixed'
    on gean alone to preserve live multi-client mesh compatibility.

No production code is called; all covered primitives are tested in
isolation.
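
For reference, unsigned LEB128 encoding (the format the varint fixtures check) fits in a few lines. This is a generic sketch of the encoding, not gean's EncodeVarint; the lower-case function names are used to make that distinction clear.

```go
package main

import "fmt"

// encodeVarint writes v as unsigned LEB128: seven payload bits per byte,
// least-significant group first, high bit set on every byte but the last.
func encodeVarint(v uint64) []byte {
	var out []byte
	for v >= 0x80 {
		out = append(out, byte(v)|0x80)
		v >>= 7
	}
	return append(out, byte(v))
}

// decodeVarint reads one value and returns it with the number of bytes
// consumed. It returns n == 0 if the input ends early or is too long.
func decodeVarint(buf []byte) (v uint64, n int) {
	var shift uint
	for i, b := range buf {
		if i == 10 {
			return 0, 0 // more than 10 bytes cannot fit in a uint64
		}
		v |= uint64(b&0x7f) << shift
		if b < 0x80 {
			return v, i + 1
		}
		shift += 7
	}
	return 0, 0
}

func main() {
	for _, v := range []uint64{0, 127, 128, 300} {
		enc := encodeVarint(v)
		dec, n := decodeVarint(enc)
		fmt.Printf("%d -> % x -> %d (%d bytes)\n", v, enc, dec, n)
	}
}
```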
fix(Dockerfile): adds proper support for arm64(MacOs) when building i…
Development

Successfully merging this pull request may close these issues.

Port gean to devnet-4

2 participants