Skip to content

feat(flashblocks rpc): Full revamp of X Layer custom flashblocks RPC layer#176

Open
sieniven wants to merge 72 commits intomainfrom
niven/cleanup-flashblocks-rpc
Open

feat(flashblocks rpc): Full revamp of X Layer custom flashblocks RPC layer#176
sieniven wants to merge 72 commits intomainfrom
niven/cleanup-flashblocks-rpc

Conversation

@sieniven
Copy link
Copy Markdown
Contributor

@sieniven sieniven commented Mar 10, 2026

Summary

This PR revamps the entire flashblocks rpc layer design, similar to the previous Realtime-feature (RTA) cache capabilities on the since discontinued xlayer-erigon (refer to: https://github.com/okx/xlayer-erigon/tree/dev/zk/realtime).

Essentially, this revamp implements the full solution 2 mentioned in the upstream reth's issue discussion thread: paradigmxyz/reth#20491, which is a more elegant design to the complex op-reth's flashblock rpc layer that is tightly coupled to the underlying anchoring canonical chainstate.

Motivations

The best design for allowing minimal latencies is to allow a full state accumulation from incoming flashblocks that are ahead of the canonical chainstate, while allowing for a hybrid sync approach that:

  1. State accumulated from incoming flashblocks that are ahead of the canonical chainstate are stored entirely in memory, while maintaining the underlying canonical chainstate as an anchor.
  2. Optimize default CL sync from the state accumulation through a pre-warming mechanism of already executed flashblocks sequence (skip payload re-validation entirely)
  3. Optimize flashblocks sequence validation with current Reth's payload validator optimizations. This includes
    • State root strategies (sparse trie caching, parallel merklization, and default synchronous SR resolution)
    • Prewarming states mechanism
    • Asynchronous resolution of transaction root, receipt root and log bloom on block assembling
  4. Support processing flashblocks sequence block access list for faster state executions
  5. Leverage on current reth engine's changeset cache, for asynchronous deferred computing of trie data
  6. Automatic flushing of flashblocks state layer and in-memory trie changesets (up till CHANGESET_CACHE_RETENTION_BLOCKS), once canonical chainstate catches up
  7. State queries (pending and confirmed states, also supporting latest tags) on flashblocks rpc tries to read from both the flashblocks state cache and underlying canon provider
  8. State queries handle edge cases automatically such as chain re-orgs, flashblocks re-orgs, and flashblocks state layer lagging behind canonical chainstate

New flashblocks rpc design

1. Flashblock cache layer

  • Adds the new FlashblockStateCache layer which includes both the pending state and confirm state caches
  • Confirm state cache indexes and stores full block data (state and stateless) in moemory
  • Pending state cache stores the current flashblocks sequence state received from the sequencer
  • On cache misses on the flashblocks state cache, all state queries are proxied to the underlying canon provider
  • To handle edge case where canon provider is ahead of flashblocks state cache and cache reorg has not yet been triggered, if state retrieval hits on both underlying canon provider and flashblocks cache, we prioritize canon provider as the source of truth
  • Holds the flashblocks state layer's trie data changeset cache, used for sparse trie task computation for accelerated flashblocks sequence pending state computation

2. Revamp execution environment for flashblocks sequence validation

The flashblocks RPC layer revamp contains significant optimizations and alignment to the upstream reth's payload validator logic. The old logic was vastly over-complicated, and the revamped version now encapsulates the flashblocks sequence validation inside the new FlashblockSequenceValidator. It aligns the sequence validation logic to the highly optimized upstream reth's engine payload validator.
Ref: https://github.com/paradigmxyz/reth/blob/main/crates/engine/tree/src/tree/payload_validator.rs

Below are some of the key optimizations:

  1. Change logic to by default always calculate state root. Merklization on payload validation has been optimized greatly on reth's v1.11 release onwards with the new sparse trie implementation (feat: sparse trie as cache paradigmxyz/reth#21583) and is no longer a bottleneck here
  2. Sequence validator uses the upstream reth's payload processor which encapsulates the merklization optimization and payload validation logic. This includes:
    • State root strategies (sparse trie caching, parallel merklization, and default synchronous SR resolution)
    • Prewarming states mechanism
    • Support processing flashblock access list
    • Asynchronous resolution of transaction root, receipt root and log bloom on block assembling
  3. Sequence validator supports both fresh builds (on new pending heights) and incremental builds for flashblock sequence deltas (building on top of the current pending state). This includes:
    • Support re-using cache reads across incremental builds
    • Merging suffix execution results with cached prefix execution results
    • Pre-validation of incoming flashblocks sequence
    • Executing transactions on top of in-memory state overlay providers, with pending bundle pre-state updated

3. Flashblocks eth rpc extension

  • Adds the new flashblocks-eth rpc extension (and maintaining compatibility with upstream OpEthApi)
  • Override all eth json-rpc APIs to support using the new flashblocks state cache layer and the underlying canon chainstate provider
  • Adds support for previously missing flashblocks eth APIs:
    • eth_sendRawTransactionSync
    • eth_getRawTransactionByBlockHashAndIndex
    • eth_getRawTransactionByBlockNumberAndIndex

4. Remove sequence manager, revamp to RawCache

  • Removes the sequence manager entirely and its overly complicated updating logic from the execution task completion thread (updating calculated SR near target flashblocks count and re-broadcasting the same sequence)
  • Simplifying to use a raw cache, which accumulates the raw flashblocks received by the sequencer
  • Raw cache maintains similar logic to get the next best build-args, while tracking the last fully executed flashblocks sequence index

5. Add handlers for flashblocks state update logic

  • Add handle_flashblocks handler - handle receiving a raw flashblock, and updating the raw cache
  • Add handle_flashblocks_state handler - handle spawning task execution for flashblocks sequence validation and updating the flashblocks state cache
  • Add handle_canonical_block - logic to handle canonical chainstate updates to flush flashblocks state cache layer, and also handle extreme cases like chain reorgs, etc

Supported flashblocks eth APIs

Block apis
[✅] eth_blockNumber
[✅] eth_getBlockByNumber
[✅] eth_getBlockByHash
[✅] eth_getBlockReceipts
[✅] eth_getBlockTransactionCountByNumber
[✅] eth_getBlockTransactionCountByHash

Transaction apis
[✅] eth_getTransactionByHash
[✅] eth_getRawTransactionByHash
[✅] eth_getTransactionReceipt
[✅] eth_getTransactionByBlockHashAndIndex
[✅] eth_getTransactionByBlockNumberAndIndex
[✅] eth_getRawTransactionByBlockHashAndIndex
[✅] eth_getRawTransactionByBlockNumberAndIndex
[✅] eth_sendRawTransactionSync

State apis
[✅] eth_call
[✅] eth_estimateGas
[✅] eth_getBalance
[✅] eth_getTransactionCount
[✅] eth_getCode
[✅] eth_getStorageAt

TODO:
Support eth_getLogs in the future

Design flaws on op-reth flashblocks rpc

  1. The flashblocks rpc layer is tightly coupled to the canonical chainstate height. Thus, although they have since added optimistic builds to advance the pending state, however confirmed chainstates (using "latest" tags) are still dependent on the canonical chainstate
  2. Flashblocks consensus sync mode does not make sense since flashblocks finality is less probably than default blocks. Flashblocks consensus sync also cannot currently account for rollbacks
  3. Lack of support or design consideration for hybrid sync approach - tightly coupled with reason 2. Using a hybrid sync approach (CL sync + flashblocks state updates) is ideal for flashblocks design. However, full payloads from CL driven sync are currently re-executed on the default payload validator on engine_newPayload (even though the flashblocks sequence had already executed and validated the exact same payload)
  4. SR calculations are triggered based on configurable values instead of a more deterministic approach, which can be a point of failure on rpc consensus. Also with reth's sparse trie async task feature implemented on reth v1.11 onwards, incremental SR calculations on payload validation becomes much cheaper
  5. No flashblocks BAL implemented

On subsequent reth version upgrades (todo):

  1. Add perf(tasks): add LazyHandle<T>, use for hash post state paradigmxyz/reth#22347 fixes once reth version is upgraded
  2. Add perf: share executed tx counter with prewarming paradigmxyz/reth#22647 optimizations to share tx index
  3. refactor(engine): PayloadProcessor::spawn_state_root paradigmxyz/reth#22604

Todos:

  1. Add support for flashblocks block access list for faster flashblocks sequence execution and validation

sieniven and others added 16 commits March 9, 2026 15:49
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…to state cache

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add a tertiary tx hash index (HashMap<TxHash, Arc<CachedTxInfo>>) to
ConfirmCache, enabling O(1) lookups for transaction_by_hash,
transaction_by_hash_with_meta, and receipt_by_hash from confirmed
flashblocks that are ahead of the canonical chain.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add collect_cached_block_range_while helper that threads a predicate
through both the provider prefix and cached suffix, enabling
sealed_headers_while to serve confirmed flashblock headers instead
of bypassing the cache entirely.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… rpc crate

Consolidates flashblocks cache provider trait implementations (block,
header, factory, receipt, transaction) into a single eth.rs override
module in the rpc crate. Wires FlashblockStateCache into the node's
RPC registration. Simplifies the cache layer to only manage state.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…an up cache interfaces

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
sieniven and others added 8 commits March 12, 2026 21:31
Consolidate the previously split flashblocks service setup and state
cache creation into one cohesive `if let Some` block. This fixes a
subtle bug where using `.map()` with `?` operators would produce
`Option<Result<T, E>>` instead of the intended `Option<T>`, causing
`?` to not propagate errors to the outer closure.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… access

Restructure ConfirmCache to store ExecutedBlock directly instead of just
BlockAndReceipts, enabling state provider overlay for flashblock state queries.

- Add ConfirmedBlock struct wrapping ExecutedBlock + receipts
- Update insert/remove signatures to accept ExecutedBlock
- Make PendingSequence.tx_index pub for state access
- Fix execution cache test assertions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…cks state

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@sieniven sieniven force-pushed the niven/cleanup-flashblocks-rpc branch from f8a26bf to 5d96010 Compare March 13, 2026 10:15
@sieniven sieniven force-pushed the niven/cleanup-flashblocks-rpc branch from 5d96010 to 6666f70 Compare March 13, 2026 10:19
sieniven and others added 2 commits March 13, 2026 19:41
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
sieniven and others added 4 commits March 25, 2026 10:17
Consolidate early-return flush logic into a single return path, prefer
newest raw cache entry via reverse iteration, fix typo in log message,
and inline format args for clippy compliance.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… build validation

Thread payload_id from raw cache build args into PrefixExecutionMeta so
the validator can verify that incremental builds belong to the same
payload sequence. Also fix test assertions for rejected flashblocks and
tighten prevalidate_incoming_sequence to detect payload_id mismatches.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove unused ConfirmCache methods (hash_for_number, remove_block_by_*,
len/is_empty in non-test), move block_from_bar into test-only scope,
drop unused type param on handle_execution_tasks, add Default impl for
FlashblockStateCache, and use idiomatic is_none_or / contains_key.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@sieniven
Copy link
Copy Markdown
Contributor Author

Claude Review — Resolution Status

The automated review identified 13 issues. Here is the resolution status for each:


Resolved (All 13/13)

# Severity Issue Resolution
1 Critical Stale hash_to_number / tx_index on duplicate height in ConfirmCache::insert() Fixed in 7c0a691. insert() now removes the old entry via blocks.remove(&height) and cleans up hash_to_number + tx_index before inserting the new block. Test renamed from ..._leaks_stale_hash_index to ..._cleans_stale_indexes and assertions updated.
2 Critical Receipt [idx] panic in commit_pending_sequence Fixed in b0a8862. Added debug_assert_eq! documenting the EVM invariant that receipts.len() == transactions.len(). Runtime .get() was deemed unnecessary since this is a core EVM guarantee.
3 Significant process_flashblock incomplete — raw cache not integrated Resolved in 35f1509. Entire pipeline rewritten: spawn_rpc() now wires handle_incoming_flashblocksRawFlashblocksCacheExecutionTaskQueuehandle_execution_tasks → validator.
4 Significant DEFAULT_TX_CACHE_SIZE = 10M with with_capacity Fixed in 7c0a691. DEFAULT_CONFIRM_BLOCK_CACHE_SIZE reduced from 1,000 to 50. Effective tx cache capacity: 50 × 10,000 = 500K (down from 10M).
5 Significant blocking_recv() in potentially async context Resolvedhandle_execution_tasks is spawned via spawn_critical_blocking_task (service.rs:233), which runs on a dedicated OS thread. Safe by design.
6 Significant Read-path flush() on error in get_state_provider_by_id Documented in 7c0a691. Added comments at all 3 flush sites explaining the flush is intentional — the overlay is non-contiguous, indicating stale state from a concurrent canonical commit. Flushing is self-correcting as subsequent execution tasks rebuild fresh state from canonical parent.
7 Minor Magic Duration::from_secs(1) undocumented Documented in 7c0a691. Comment added: "Set pending block deadline to 1 second matching default blocktime."
8 Minor eth_blockNumber override may confuse clients Documented in 7c0a691. Added doc comment on the trait method explaining it returns max(flashblocks_confirm_height, canonical_height) and that non-overridden methods (e.g., eth_getLogs) only see canonical state.
9 Minor prevalidate_incoming_sequence accepts stale flashblocks below canon height Fixed in 7c0a691. Added explicit rejection: incoming_block_number <= canon_height now returns Err("stale height: ...").
10 Minor get_best_revision not connected to BuildArgs Resolved in 35f1509. try_get_buildable_args() now calls try_get_best_revision() internally and pipes it into try_to_buildable_args() to construct BuildArgs.
11 Minor RawFlashblocksCache &mut self unnecessarily restrictive Resolved in 35f1509. Outer methods now take &self and acquire the inner RwLock appropriately.
12 Minor assemble_flashblock Ecotone blob gas diverges from upstream Verified — identical to upstream OpBlockAssembler in deps/optimism/rust/op-reth/crates/evm/src/build.rs:82-91. Both hardcode (Some(0), Some(0)) for Ecotone. No fix needed.
13 Minor block_from_bar dead code Resolved in 89ee622. Moved entirely inside #[cfg(test)] module in utils.rs.

Additional fixes from cleanup session (commits a74a10c, b0a8862, 89ee622)

  • Bug fix: handle_canonical_block early return was skipping canon_info and confirm_height updates after flush. Fixed by removing early return and using shared fall-through path.
  • Bug fix: try_get_buildable_args iterated oldest-first (returning stale entry on sequencer reorg). Fixed by adding .rev() to iterate newest-first.
  • Feature: Added payload_id to BuildArgs and PrefixExecutionMeta for incremental build validation — ensures incoming sequence matches the pending payload ID.
  • Cleanup: Removed unused methods from ConfirmCache (hash_for_number, remove_block_by_number, remove_block_by_hash), gated test-only utilities with #[cfg(test)], fixed all clippy warnings.

All 68 flashblocks tests pass. just check-clippy clean.

@sieniven
Copy link
Copy Markdown
Contributor Author

@claude review again for me. The handles logic have been added and the PR is completed and ready for review. Also ensure that all previously mentioned issues have been resolved.

cc/ @JimmyShi22

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 25, 2026

Claude finished @sieniven's task in 5m 31s —— View job


PR Review: feat(flashblocks rpc): Full revamp of X Layer custom flashblocks RPC layer

Tasks:

  • Read PR context and understand the scope
  • Verify all 13 previously identified issues are resolved
  • Review new handles/pipeline integration code (state.rs)
  • Review service.rs wiring
  • Review cache modules for correctness
  • Review execution/validator.rs
  • Review rpc layer (flashblocks.rs, default.rs, helper.rs)
  • Compile and summarize findings

Verification: Previously Identified Issues

All 13 previously flagged issues are resolved as described. The key confirmations:

# Status Notes
1 ✅ Fixed ConfirmCache::insert now removes old hash+tx entries before inserting. See new issue below.
2 ✅ Documented debug_assert_eq! documents the EVM invariant. Direct index at line 418 is still there but acceptable for a core EVM guarantee.
3 ✅ Wired Full pipeline in service.rs:spawn_rpchandle_incoming_flashblocksRawFlashblocksCachehandle_execution_tasks → validator.
4 ✅ Fixed DEFAULT_CONFIRM_BLOCK_CACHE_SIZE = 50, effective tx cache capacity 500K.
5 ✅ Safe handle_execution_tasks spawned via spawn_critical_blocking_task (service.rs:233).
6 ✅ Documented Flush-on-error comments at all 3 sites explain the rationale (non-contiguous overlay = stale state, self-healing).
7 ✅ Documented Comment added: "Set pending block deadline to 1 second matching default blocktime."
8 ✅ Documented Doc comment on eth_blockNumber explains the max(fb, canon) behavior and the caveat for non-overridden methods.
9 ✅ Fixed incoming_block_number <= canon_height now returns Err("stale height: ...") at validator.rs:480.
10 ✅ Connected try_get_buildable_args calls try_get_best_revision() internally via try_to_buildable_args() (raw.rs:208).
11 ✅ Fixed Outer methods now take &self and acquire the inner RwLock (raw.rs:34-47).
12 ✅ Verified (Some(0), Some(0)) for Ecotone matches upstream OpBlockAssembler.
13 ✅ Fixed block_from_bar moved into #[cfg(test)] in utils.rs.

New Issues Found in This Review

Critical: ConfirmCache::insert — wrong order introduces silent tx_index corruption

📄 crates/flashblocks/src/cache/confirm.rs:87-112

The fix to issue #1 introduced a new correctness bug. In insert(), the new block's tx_index entries are inserted before the old block's tx_index entries are removed. This means that if the old block and the new block at the same height share any transaction hashes, remove_tx_index_for_block silently deletes the newly-inserted entries:

// Step 1: Insert new tx entries for new block (lines 90-102)
for (idx, (tx, receipt)) in txs.iter().zip(receipts.as_ref().iter()).enumerate() {
    self.tx_index.insert(tx_hash, CachedTxInfo { ... }); // ← new entry written
}

// Step 2: Remove old block, which calls remove_tx_index_for_block on old block (lines 104-108)
if let Some((old_hash, old_block)) = self.blocks.remove(&height) {
    self.hash_to_number.remove(&old_hash);
    self.remove_tx_index_for_block(&old_block); // ← can delete entries inserted in Step 1!
}

Scenario: Flashblock reorg at height H — old sequence and new sequence both include the same user transaction (same hash). After insert: the transaction is missing from tx_index entirely. eth_getTransactionByHash returns None even though the tx is in the new confirmed block.

Fix: swap the order — remove the old block's tx_index entries before inserting the new block's entries:

// First: clean up old block at this height
if let Some((old_hash, old_block)) = self.blocks.remove(&height) {
    self.hash_to_number.remove(&old_hash);
    self.remove_tx_index_for_block(&old_block);
}
// Then: insert new entries
for (idx, (tx, receipt)) in txs.iter().zip(receipts.as_ref().iter()).enumerate() {
    self.tx_index.insert(tx_hash, CachedTxInfo { ... });
}

Note: The existing test test_confirm_cache_insert_duplicate_height_cleans_stale_indexes uses empty_receipts() (no transactions) so it doesn't catch this. A test with overlapping tx hashes is needed.

Fix this →


Significant: ConfirmCache::insert capacity check blocks legitimate height replacement

📄 crates/flashblocks/src/cache/confirm.rs:81-85

if self.blocks.len() >= DEFAULT_CONFIRM_BLOCK_CACHE_SIZE {
    return Err(eyre!("confirm cache at max capacity..."));
}

This check fires before the duplicate-height replacement path. If the cache is at capacity (50 blocks) and a reorg arrives for a height already in the cache, the operation fails instead of replacing. The slot is occupied — it isn't a new insertion — so the capacity guard is wrong here:

// Correct: only block genuinely new heights when at capacity
if self.blocks.len() >= DEFAULT_CONFIRM_BLOCK_CACHE_SIZE && !self.blocks.contains_key(&height) {
    return Err(eyre!("confirm cache at max capacity..."));
}

Fix this →


Minor: spawn_prewarm uses PayloadId::default() for all prewarmed payloads

📄 crates/flashblocks/src/service.rs:287

let payload = OpBuiltPayload::<OpPrimitives>::new(
    reth_payload_builder::PayloadId::default(), // ← all-zeros
    ...
);

All prewarmed payloads are registered with the same all-zeros payload ID. If the CL engine queries engine_getPayloadV2 with the actual payload ID, it won't match this prewarmed entry and the prewarming benefit may be lost. The actual payload ID from the pending sequence (pending_sequence.prefix_execution_meta.payload_id) should be used here if the intent is for the CL to retrieve the prewarmed payload.


New Pipeline Code Assessment (state.rs, service.rs)

The completed wiring in spawn_rpc and state.rs is well-structured:

  • Task separation is correct: handle_incoming_flashblocks is async (tokio task), handle_execution_tasks is blocking (dedicated OS thread via spawn_critical_blocking_task), handle_canonical_stream is async. This cleanly separates IO-bound from CPU-bound work.
  • Batch draining in handle_incoming_flashblocks (lines 52-75) correctly uses now_or_never() to drain buffered flashblocks before notifying the executor — avoids redundant wakeups.
  • Queue eviction logic (state.rs:112-121): evicts lowest height when full. This is correct — if the executor is behind, stale lower heights should give way to newer heights.
  • condvar.wait_while(guard, |q| q.is_empty()) correctly blocks only when empty; when heights remain in the queue after processing one, the thread loops immediately without waiting. No missed wakeup issue.
  • handle_canonical_stream correctly calls raw_cache.handle_canonical_heightflashblocks_state.handle_canonical_blocktask_queue.flush() in that order, ensuring consistent state across all three components.

Positive Observations

  • The spawn_prewarm design elegantly reuses the confirmed sequence for pre-heating the engine tree execution cache via the existing payload_events_sender infrastructure.
  • prevalidate_incoming_sequence correctly handles all four cases: incremental build (same height, same payload_id), fresh build (height+1), stale (≤ canon), and too-far-ahead (> canon+1).
  • merge_suffix_results correctly re-uses suffix requests only (not prefix requests), which is the right call since post-execution requests are produced from complete block state.
  • spawn_deferred_trie_task correctly wraps the compute in catch_unwind(AssertUnwindSafe(...)) so a panic in the background task does not bring down the execution thread.
  • await_state_root_with_timeout racing loop is correct — polls both channels in a tight 10ms loop and returns whichever wins.
  • FlashblocksEthApiExt::send_raw_transaction_sync correctly uses biased; to prefer flashblock stream hits over canonical notifications (flashblocks have lower latency), and correctly handles canonical stream None (stream ended) without panicking.

Summary

The PR is in a good state. The three previously-WIP items (execution pipeline wiring, raw cache integration, and handler logic) are now implemented. The pipeline architecture is sound. Two correctness issues were found:

  1. The tx_index insertion order in ConfirmCache::insert is wrong (critical — introduced by the fix to the previous issue feat(build): Use just as command runner #1). Must be fixed before merge.
  2. Capacity check prevents legitimate height replacement (significant — can cause unexpected errors during a reorg at full cache capacity).

The spawn_prewarm PayloadId::default() issue is minor but worth addressing for correctness.

- Reorder insert() to remove old block entries before inserting new tx
  index entries, preventing silent tx_index corruption when old and new
  blocks share transaction hashes
- Allow height replacement at full capacity by checking contains_key
  before rejecting inserts
- Add test for overlapping tx hash scenario on duplicate height insert
- Document PayloadId::default() usage in spawn_prewarm

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@sieniven
Copy link
Copy Markdown
Contributor Author

Second Claude Review — Resolution Status

The second review (triggered after all 13 original issues were resolved) found 3 new issues. All are now resolved in 455fcaa:

# Severity Issue Resolution
1 Critical ConfirmCache::insert — tx_index entries inserted before old block cleanup, causing silent deletion of shared tx hashes Fixed. Reordered: old block removal (blocks.remove, hash_to_number.remove, remove_tx_index_for_block) now happens before new tx_index entries are inserted. Added test_confirm_cache_insert_duplicate_height_retains_shared_tx_entries to cover the overlapping tx hash scenario.
2 Significant Capacity check blocks legitimate height replacement when cache is full Fixed. Added && !self.blocks.contains_key(&height) to the capacity guard, allowing replacements at existing heights even at max capacity.
3 Minor spawn_prewarm uses PayloadId::default() for all prewarmed payloads Kept as default with added comment. Since this is an RPC node (engine_getPayload is never called), using PayloadId::default() is intentional — avoids accumulating stale entries in the engine state tree payload job map.

All 69 flashblocks tests pass (68 existing + 1 new). cargo clippy clean.

@sieniven
Copy link
Copy Markdown
Contributor Author

@claude review again for me

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 25, 2026

Claude finished @sieniven's task in 4m 26s —— View job


PR Review: feat(flashblocks rpc): Full revamp of X Layer custom flashblocks RPC layer

Tasks:

  • Read PR context and understand scope
  • Verify all previously fixed issues (from reviews 1 & 2) are correctly resolved
  • Review cache/confirm.rs — ConfirmCache correctness
  • Review cache/mod.rs — FlashblockStateCache
  • Review cache/raw.rs — RawFlashblocksCache and pipeline wiring
  • Review execution/validator.rs — FlashblockSequenceValidator
  • Review execution/assemble.rs — Block assembly
  • Review service.rs and state.rs — Pipeline and handlers
  • Review rpc/src/flashblocks.rs — RPC extension
  • Review rpc/src/helper.rs and rpc/src/default.rs
  • Compile and post findings

Verification: All Previously Identified Issues

All issues from both prior reviews are correctly resolved. Spot-checked below.

# Prior Issue Verified
1 (R1) Stale hash index in ConfirmCache::insert ✅ Old block removed before new entries inserted (line 88–112)
2 (R1) Receipt [idx] panic debug_assert_eq! preserves EVM invariant; direct index retained
3 (R1) process_flashblock unconnected ✅ Full pipeline wired in service.rs:spawn_rpc
4 (R1) DEFAULT_TX_CACHE_SIZE = 10M ✅ Cache reduced to 50 blocks, effective 500K tx entries
5 (R1) blocking_recv() in async context ✅ Spawned via spawn_critical_blocking_task (service.rs:233)
6 (R1) Read-path flush on error ✅ All 3 flush sites now documented with rationale
7 (R1) Magic +Duration::from_secs(1) ✅ Comment added at validator.rs:423
8 (R1) eth_blockNumber caveat ✅ Doc comment on trait method
9 (R1) Stale flashblocks below canon accepted incoming_block_number <= canon_height now returns Err(...)
10 (R1) get_best_revision disconnected try_get_buildable_args now calls try_to_buildable_args()
11 (R1) &mut self unnecessarily restrictive ✅ Outer methods now take &self
12 (R1) Ecotone blob gas ✅ Matches upstream OpBlockAssembler
13 (R1) block_from_bar dead code ✅ Moved inside #[cfg(test)]
C1 (R2) tx_index insert-before-remove ordering ✅ Old block now removed before new tx entries are inserted (confirm.rs:88–112)
C2 (R2) Capacity check blocks height replacement ✅ Guard now && !self.blocks.contains_key(&height) (confirm.rs:81)
C3 (R2) PayloadId::default() in spawn_prewarm ✅ Intentionally kept with comment — this is an RPC-only node

New Issues Found

Minor: withdrawals_at does not accumulate across flashblock indices

📄 crates/flashblocks/src/cache/raw.rs:197–199, 208–216

transactions_up_to(n) collects transactions from all indices 0..=n cumulatively. But withdrawals_at(n) only returns the withdrawals from the single flashblock at index n:

fn withdrawals_at(&self, index: u64) -> Vec<Withdrawal> {
    self.payloads.get(&index).map(|p| p.diff.withdrawals.clone()).unwrap_or_default()
}

If the sequencer distributes withdrawals across multiple flashblock diffs (e.g., some in index 0, more in index 1), using withdrawals_at(best_revision) would silently drop all withdrawals from earlier indices. This is only safe if the protocol guarantees that: (a) all withdrawals are in the final flashblock of the sequence, or (b) each flashblock's diff.withdrawals is the cumulative list (not incremental). Neither assumption is validated in code or documented. Worth explicitly verifying against the OP Stack flashblocks spec. If withdrawals are incremental, this needs the same accumulation pattern as transactions_up_to.

Verify/fix if needed →


Nit: .expect("valid cached tx") in RPC hot path

📄 crates/rpc/src/helper.rs:114

converter.fill(info.tx.clone().try_into_recovered().expect("valid cached tx"), tx_info)

try_into_recovered() can fail if the signer cannot be recovered from the stored transaction. Since CachedTxInfo::tx is stored without the signer being embedded in the type, a recovery failure here would panic inside a JSON-RPC server async task and crash it. The invariant comment is correct in expectation, but an ? propagation would be safer for production code.

Fix this →


Code Quality Observations

The completed pipeline architecture reads cleanly. Specific observations:

  • handle_canonical_block fix is correct: After the early-return-removal fix (a74a10c), both the flush and evict paths now unconditionally update canon_info and confirm_height (mod.rs:424–425). ✅
  • state.rs task separation: Async IO-bound tasks on tokio, blocking CPU-bound execution on a dedicated OS thread via spawn_critical_blocking_task. Correct. ✅
  • condvar wakeup loop: After processing one height, the loop immediately re-checks q.is_empty() without waiting — correct for multi-height queue draining. ✅
  • Reorg flow: handle_canonical_stream calls raw_cache.handle_canonical_heightflashblocks_state.handle_canonical_blocktask_queue.flush() in that order, keeping all three components in sync. ✅
  • prevalidate_incoming_sequence covers all four cases: fresh build (height+1), incremental (same height, same payload_id), stale (≤ canon), too far ahead (> canon+1). ✅
  • spawn_deferred_trie_task panic safety: catch_unwind(AssertUnwindSafe(...)) protects the execution thread from a panic in the background trie computation. ✅
  • Batch drain in handle_incoming_flashblocks: now_or_never() loop correctly drains buffered flashblocks before a single notify_one() — avoids redundant wakeups. ✅

Summary

This is the third review pass. All 15 issues from reviews 1 and 2 are correctly resolved, including the critical ordering fix and the capacity guard fix. The only new items are a minor protocol-behavior question on withdrawals_at and a small panic surface in the RPC helper. The pipeline architecture, execution logic, and cache correctness are all solid. The PR is in good shape for merge pending the withdrawals accumulation verification.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@sieniven
Copy link
Copy Markdown
Contributor Author

Third Claude Review — Resolution Status

The third review confirmed all 15 prior issues resolved and found 2 new minor items. Both addressed in 455fcaa and 50e4841:

# Severity Issue Resolution
1 Minor withdrawals_at does not accumulate across flashblock indices Not a bug. Per the OP Stack flashblocks spec: "All fields in this structure represent the cumulative state of the entire block up to and including the current flashblock, not just the changes from this specific flashblock." Each diff's withdrawals is the complete list — no accumulation needed. Additionally, on OP Stack post-merge (Isthmus), withdrawals is always [] with only withdrawals_root set. Added a clarifying comment in 50e4841.
2 Nit .expect("valid cached tx") in RPC hot path Kept as .expect() with expanded comment (50e4841). The error type (OpTxEnvelope) cannot be converted to Rpc::Error via ?, and the invariant is guaranteed — the tx was originally recovered from a valid ExecutedBlock. Added comment documenting the safety invariant.

All 69 flashblocks tests pass. cargo clippy clean. All 17 review issues across 3 rounds are now resolved.

sieniven and others added 7 commits March 25, 2026 16:56
…for execution

Reject incoming flashblocks when canonical height is still zero (not yet
initialized) to prevent premature validation. Switch the execution task
from spawn_critical_blocking_task to a dedicated OS thread via
reth_tasks::spawn_os_thread since it runs a blocking loop.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Preserve canonical height continuity after a state cache flush instead
of resetting confirm_height to zero, which could cause stale validation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tate

Thread CanonicalInMemoryState into FlashblockStateCache so that
get_overlay_data collects executed blocks from both the flashblocks
cache and the engine's in-memory canonical chain, anchoring the state
provider at the on-disk persistence boundary instead of the canon hash.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nces to confirm cache

Split the promotion logic in handle_pending_sequence to handle target_index
completion and next-block arrival as separate cases. When target_index is
reached for the current block, the complete sequence is promoted directly to
the confirm cache and pending is cleared, preventing stale pending detection
when the canonical block arrives. Also update validator prevalidation to use
confirm_height instead of canon_height for flashblock height checks.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move the cumulative gas offset application from merge_suffix_results
into the per-receipt execution loop so that receipts sent to the
incremental root builder already carry the correct cumulative_gas_used.
Refactor FlashblockReceipt::add_cumulative_gas_offset to operate on a
single receipt instead of a slice. Also improve logging consistency and
add target_index to execution trace.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…vider

For incremental builds, resolve the state provider from the pending
sequence block hash instead of the parent hash so the overlay trie
anchors at the latest executed prefix. Thread the parent SealedHeader
into PendingSequence so the EVM env and state root remain consistent
across incremental steps. Pass evm_env directly to execute_block to
avoid cloning the full ExecutionEnv.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
sieniven and others added 8 commits March 27, 2026 17:15
…ilds

Incremental builds accumulate one revert entry per flashblock index via
merge_transitions, but the engine persistence service expects a single
revert per block. Flatten all revert transitions by keeping the earliest
(parent-state) account info per address and merging storage slot reverts
across transitions, preserving wipe_storage from later indices.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… engine

Add a debug mode (--xlayer.flashblocks-debug-state-comparison) that
compares ExecutedBlock bundle states, reverts, and trie data between
the flashblocks RPC cache and the engine canonical state on each new
canonical block. Also add a flag to disable pre-warming so the engine
computes payloads independently for accurate comparison. The heavy
comparison runs on a blocking thread to avoid stalling the canonical
stream handler.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Expose a unified lookup across pending and confirm caches for
ExecutedBlock by block number, simplifying the debug comparison to a
single call instead of inline pending sequence matching.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Drop DeferredTrieData comparison (hashed_state + trie_updates) from the
debug state comparison since trie data is computed independently by the
engine and not meaningful for execution output equivalence checks.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants