feat(seismic): add opt-in ECSD address screening#331

Open
aegis-cipherowl wants to merge 4 commits into SeismicSystems:seismic from cipherowl-ai:ecsd-address-screening

Conversation

@aegis-cipherowl

Summary

Adds operator-enforced compliance screening for Seismic validators via integration with CipherOwl's ECSD (Ethereum Compliance Screening Daemon). This is an opt-in policy layer that screens transaction addresses against a compliance blocklist before pool admission.

  • Protocol vs Policy Separation: Screening is a separate ScreeningTransactionValidator wrapper, NOT part of protocol consensus rules
  • Type Transparency: Uses Either<A, B> to maintain pool type compatibility whether screening is enabled or disabled
  • Fail-Safe Modes: Configurable behavior (fail-open or fail-closed) when ECSD is unreachable
  • Production Performance: <1ms overhead per transaction (~470µs avg) with 100ms timeout providing 200x safety margin
  • Full Observability: Prometheus metrics for latency, throughput, errors, and address extraction

Key Features

Address Extraction

Extracts addresses from transactions for screening:

  • Transaction-level: sender, recipient, EIP-7702 authorizations, access lists
  • ERC-20: transfer, approve, transferFrom
  • ERC-721: safeTransferFrom variants
  • ERC-1155: safeTransferFrom, safeBatchTransferFrom

Performance: ~18ns per extraction (negligible CPU overhead)
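
As a rough illustration of the ERC-20 case above, the following is a minimal sketch and not the code in calldata.rs. It assumes plain 20-byte arrays for addresses and a hypothetical helper name, `extract_erc20_addresses`; the selectors are the standard ERC-20 ones.

```rust
/// 20-byte Ethereum address (plain array to keep the sketch dependency-free).
type Address = [u8; 20];

// Standard ERC-20 function selectors (first 4 bytes of keccak256 of the signature).
const TRANSFER: [u8; 4] = [0xa9, 0x05, 0x9c, 0xbb]; // transfer(address,uint256)
const APPROVE: [u8; 4] = [0x09, 0x5e, 0xa7, 0xb3]; // approve(address,uint256)
const TRANSFER_FROM: [u8; 4] = [0x23, 0xb8, 0x72, 0xdd]; // transferFrom(address,address,uint256)

/// Reads the address held in the `index`-th 32-byte ABI word after the selector.
/// Returns `None` when the calldata is too short (truncated or malformed input).
fn address_arg(calldata: &[u8], index: usize) -> Option<Address> {
    let start = 4 + index * 32;
    let word = calldata.get(start..start + 32)?;
    // An ABI-encoded address occupies the low 20 bytes of the word.
    let mut addr = [0u8; 20];
    addr.copy_from_slice(&word[12..]);
    Some(addr)
}

/// Hypothetical helper: collects the address arguments of recognised ERC-20 calls.
/// Unknown selectors and short calldata yield an empty list.
fn extract_erc20_addresses(calldata: &[u8]) -> Vec<Address> {
    let arg_count = match calldata.get(..4) {
        Some(s) if s == &TRANSFER[..] || s == &APPROVE[..] => 1,
        Some(s) if s == &TRANSFER_FROM[..] => 2,
        _ => 0,
    };
    (0..arg_count)
        .filter_map(|i| address_arg(calldata, i))
        .collect()
}
```

Every slice access goes through `get`, so truncated calldata produces fewer addresses instead of a panic.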

gRPC Client

  • Persistent HTTP/2 connection with lazy initialization
  • Configurable timeout (default 100ms)
  • Fail modes: open (permissive) or closed (restrictive)
  • Builder pattern with validation

Performance: ~470µs avg latency with real ECSD Docker
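
The two fail modes can be pictured as a small decision table. The sketch below is illustrative only: `ScreeningOutcome`, `Decision`, and `decide` are hypothetical names, not the types used in client.rs.

```rust
/// What the operator wants to happen when ECSD cannot be reached.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FailMode {
    Open,   // admit the transaction (permissive)
    Closed, // reject the transaction (restrictive)
}

/// Simplified result of one screening call (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ScreeningOutcome {
    Clean,
    Flagged,
    Unavailable, // timeout or transport error
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Decision {
    Admit,
    Reject,
}

/// The fail mode only matters when the daemon gave no answer.
fn decide(outcome: ScreeningOutcome, mode: FailMode) -> Decision {
    match (outcome, mode) {
        (ScreeningOutcome::Clean, _) => Decision::Admit,
        (ScreeningOutcome::Flagged, _) => Decision::Reject,
        (ScreeningOutcome::Unavailable, FailMode::Open) => Decision::Admit,
        (ScreeningOutcome::Unavailable, FailMode::Closed) => Decision::Reject,
    }
}
```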

CLI Configuration

seismic-reth node \
  --screening.enable \
  --screening.endpoint http://127.0.0.1:9090 \
  --screening.timeout-ms 100 \
  --screening.fail-mode open

Security Fix

  • Fail-mode validation: Invalid values (typos like close instead of closed) now fail fast at startup
  • No silent degradation: Previously unwrap_or(Open) could silently use permissive mode when operator intended restrictive mode
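
To make the difference concrete, here is a minimal sketch of strict parsing using only the standard library. The PR itself relies on a clap value_parser; the `FailMode` type and error text below are assumptions for illustration.

```rust
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FailMode {
    Open,
    Closed,
}

impl FromStr for FailMode {
    type Err = String;

    /// Accepts exactly "open" or "closed"; anything else is an error
    /// that the caller is expected to surface at startup.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "open" => Ok(FailMode::Open),
            "closed" => Ok(FailMode::Closed),
            other => Err(format!(
                "invalid fail mode '{other}', expected 'open' or 'closed'"
            )),
        }
    }
}

/// The pattern being removed: a typo silently becomes the permissive mode.
fn parse_lenient(s: &str) -> FailMode {
    s.parse().unwrap_or(FailMode::Open)
}
```

With the lenient helper, `close` quietly maps to `Open`; with the strict parse, the same input is an error the node can refuse to start on.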

Architecture

Transaction -> TransactionValidationTaskExecutor -> Either<
    SeismicTransactionValidator,                    // No screening (default)
    ScreeningTransactionValidator<                  // With screening (opt-in)
        SeismicTransactionValidator
    >
>
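
A stripped-down sketch of why the Either wrapper keeps a single pool type: both branches implement the same validator trait, so the enum can implement it too. The trait, the string-based transactions, and `build_validator` are simplified stand-ins, not the reth or Seismic definitions.

```rust
/// Simplified stand-in for the pool's validator trait.
trait TransactionValidator {
    fn validate(&self, sender: &str) -> bool;
}

/// Minimal two-variant sum type, defined locally for this sketch.
enum Either<A, B> {
    Left(A),
    Right(B),
}

impl<A: TransactionValidator, B: TransactionValidator> TransactionValidator for Either<A, B> {
    fn validate(&self, sender: &str) -> bool {
        match self {
            Either::Left(a) => a.validate(sender),
            Either::Right(b) => b.validate(sender),
        }
    }
}

/// Protocol-level checks only (here: sender must be non-empty).
struct BaseValidator;

impl TransactionValidator for BaseValidator {
    fn validate(&self, sender: &str) -> bool {
        !sender.is_empty()
    }
}

/// Policy layer wrapped around any inner validator.
struct ScreeningValidator<V> {
    inner: V,
    blocked: Vec<String>,
}

impl<V: TransactionValidator> TransactionValidator for ScreeningValidator<V> {
    fn validate(&self, sender: &str) -> bool {
        self.inner.validate(sender) && !self.blocked.iter().any(|b| b.as_str() == sender)
    }
}

/// One concrete type regardless of whether screening is enabled.
type PoolValidator = Either<BaseValidator, ScreeningValidator<BaseValidator>>;

fn build_validator(screening_enabled: bool, blocked: Vec<String>) -> PoolValidator {
    if screening_enabled {
        Either::Right(ScreeningValidator { inner: BaseValidator, blocked })
    } else {
        Either::Left(BaseValidator)
    }
}
```

Because `PoolValidator` is the same type in both configurations, code that holds the pool does not need to be generic over the screening choice.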

Performance Benchmarks

Benchmarked against real ECSD Docker container:

| Metric | Value | Notes |
|---|---|---|
| Calldata extraction | ~18ns | Negligible CPU overhead |
| ECSD gRPC latency | ~470µs | Per transaction (localhost) |
| Throughput | ~2,100 tx/sec | Sequential screening |
| Parallel (10 streams) | ~20,000 tx/sec | Estimated |
| Safety margin | 200x | 100ms timeout vs 470µs avg |
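
The derived rows follow from the measured latency by simple arithmetic. The helper names below are hypothetical, and the parallel figure assumes ideal linear scaling across streams, which is why the table marks it as estimated.

```rust
use std::time::Duration;

/// How many average-latency calls fit inside the timeout.
fn safety_margin(timeout: Duration, avg_latency: Duration) -> f64 {
    timeout.as_secs_f64() / avg_latency.as_secs_f64()
}

/// Transactions per second when each call waits for the previous one.
fn sequential_throughput(avg_latency: Duration) -> f64 {
    1.0 / avg_latency.as_secs_f64()
}

/// Idealised upper bound for `streams` concurrent calls (assumes linear scaling).
fn parallel_throughput(avg_latency: Duration, streams: u32) -> f64 {
    sequential_throughput(avg_latency) * f64::from(streams)
}
```

With a 100ms timeout and 470µs average latency, the margin works out to about 213x and sequential throughput to about 2,128 tx/sec, consistent with the rounded values in the table.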

Files Changed

New Files (15)

  • Proto & Build:

    • crates/seismic/txpool/proto/ecsd.proto - gRPC service definition
    • crates/seismic/txpool/build.rs - Proto compilation
  • Screening Module (6 files):

    • crates/seismic/txpool/src/screening/mod.rs
    • crates/seismic/txpool/src/screening/calldata.rs - Address extraction
    • crates/seismic/txpool/src/screening/client.rs - gRPC client
    • crates/seismic/txpool/src/screening/metrics.rs - Prometheus metrics
    • crates/seismic/txpool/src/screening/validator.rs - Validator wrapper
    • crates/seismic/txpool/src/screening/README.md - Documentation
  • Benchmarks (4 files):

    • crates/seismic/txpool/benches/calldata_extraction.rs
    • crates/seismic/txpool/benches/screening.rs
    • crates/seismic/txpool/benches/mock_ecsd.rs
    • crates/seismic/txpool/benches/README.md - Documentation
  • CLI Args:

    • crates/node/core/src/args/screening.rs - CLI configuration

Modified Files (8)

  • Cargo.toml, Cargo.lock - Added tonic, prost dependencies
  • bin/seismic-reth/src/main.rs - Wired up ScreeningArgs
  • crates/node/core/src/args/mod.rs - Exported ScreeningArgs
  • crates/seismic/node/Cargo.toml - Added futures-util
  • crates/seismic/node/src/node.rs - Conditional screening with Either
  • crates/seismic/txpool/Cargo.toml - Screening dependencies
  • crates/seismic/txpool/src/lib.rs - Updated pool type alias

Test Plan

  • Unit Tests: 27 tests passing (18 screening-specific + existing)

    • Calldata extraction for ERC-20, ERC-721, ERC-1155
    • Edge cases (empty, truncated, malformed data)
    • Client builder and configuration
    • Fail-open and fail-closed behavior
    • CLI validation for invalid fail-mode values
  • Benchmarks: All benchmarks passing with real ECSD Docker

    • Calldata extraction CPU benchmarks
    • gRPC latency benchmarks (2, 5, 10, 50 addresses)
    • Throughput benchmarks (100, 500 transactions)
    • Overhead comparison (extraction vs full screening)
  • Integration: Full compilation and type checking

    • Pool type compatibility with Either
    • CLI args parsing and validation
    • Node builder wiring
  • Manual Testing (post-merge):

    • Start ECSD Docker: docker run -p 9090:9090 cipherowl/ecsd:latest
    • Run validator with screening enabled
    • Submit transactions and verify screening metrics
    • Test fail-open behavior (stop ECSD, transactions should pass)
    • Test fail-closed behavior (stop ECSD, transactions should reject)
    • Verify invalid CLI args fail fast
  • Production Validation:

    • Deploy to testnet with screening enabled
    • Monitor Prometheus metrics for latency/throughput
    • Verify no impact on block building performance
    • Test ECSD failover scenarios

Documentation

  • ✅ Comprehensive README for screening module (src/screening/README.md)
  • ✅ Comprehensive README for benchmarks (benches/README.md)
  • ✅ Inline documentation for all public APIs
  • ✅ CLI help text for all screening flags
  • ✅ Performance analysis and capacity planning

Security Considerations

  1. Fail-Mode Validation: Added value_parser to reject invalid fail-mode values at startup (prevents silent misconfiguration)
  2. Separation of Concerns: Screening is operator policy, NOT consensus (other validators can have different configs)
  3. Fail-Safe Defaults: Default is disabled + fail-open (permissive)
  4. Timeout Safety: 100ms timeout with 200x margin over typical latency

Migration Impact

  • Zero breaking changes: Screening is opt-in, disabled by default
  • Backward compatible: Existing deployments unaffected
  • Type safe: Either maintains pool type compatibility

🤖 Generated with Claude Code

This commit introduces a new `ScreeningArgs` struct for configuring address screening in the Seismic node. The `SeismicNode` struct is updated to optionally include screening arguments, allowing for integration with an ECSD sidecar for address validation. The CLI is also modified to accept these new arguments, enhancing the node's capabilities for address screening during transaction processing. Additionally, various dependencies are updated in the Cargo files to support these changes.
aegis-cipherowl and others added 2 commits March 13, 2026 11:24
Resolves all CI check failures in PR SeismicSystems#331:

1. **rustfmt**: Reformatted .expect() call in node.rs to multi-line
   format per nightly rustfmt preferences

2. **protoc dependency**: Added protobuf-compiler installation to 5 CI
   jobs (warnings, clippy, unit-test, integration-test, viem) required
   by tonic-build for proto compilation

3. **clippy expect_used**: Added #[allow(clippy::expect_used)]
   annotations with justification:
   - calldata.rs: array length checked immediately before
   - node.rs: fail_mode validated by clap value_parser

All .expect() calls are justified and fail-fast is intentional.

Eliminates the protobuf-compiler (protoc) dependency by checking in the
generated gRPC client/server code, simplifying the build process and CI
configuration.

Changes:
- Add crates/seismic/txpool/src/screening/proto.rs (407 lines)
  Generated code from ecsd.proto with regeneration instructions
- Remove crates/seismic/txpool/build.rs
  No longer need tonic-build at compile time
- Remove [build-dependencies] from Cargo.toml
  Eliminates tonic-build = "0.12" dependency
- Update client.rs and mock_ecsd.rs
  Use super::proto instead of include_proto! macro
- Remove protoc installation from CI
  Deleted protoc setup from 5 jobs: warnings, clippy, unit-test,
  integration-test, viem

Benefits:
- No protoc installation required for developers or CI
- Faster builds (no proto compilation step)
- Simpler CI configuration (removed 10 lines across 5 jobs)
- Works on all platforms without protobuf-compiler package
- Generated code is version-controlled and reviewable

The proto source file remains at crates/seismic/txpool/proto/ecsd.proto
for reference. If it changes, proto.rs can be regenerated following the
instructions in its header comment.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Contributor

@daltoncoder daltoncoder left a comment


Looks good to me. Going to test some more on some of our testnets. A couple comments:

Result<tonic::Response<BatchCheckResponse>, tonic::Status>,
tokio::time::error::Elapsed,
> = {
let mut client = self.inner.client.lock().await;
Contributor


This Mutex lock is held for the duration of the call. That could really affect how many transactions per second we can process. The underlying tonic client is Clone and supports multiplexing, so I think dropping the mutex and doing this here instead would greatly increase our maximum throughput.

let mut client = self.inner.client.clone();
tokio::time::timeout(self.inner.timeout, client.batch_check_addresses(request)).await

valid_tx.into_transaction(),
InvalidPoolTransactionError::Consensus(
InvalidTransactionError::SeismicTx(format!(
"address screening: flagged addresses {flagged:?}"
Contributor


This error will propagate back through RPC, so I just wanted to make sure that is okay. If your blacklist is private, this could let people submit transactions and figure out exactly which addresses are blacklisted. If that's the case, we should probably make this error super generic: "txn rejected".
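
One way to picture that split, as a sketch only: keep the flagged addresses in an operator-side log line and return a fixed string to the caller. `Rejection` and `build_rejection` are hypothetical names, and the public wording is an assumption.

```rust
/// Message returned to the RPC caller: deliberately free of blocklist details.
const PUBLIC_REJECTION: &str = "transaction rejected";

/// Pair of messages for one rejected transaction (hypothetical type).
struct Rejection {
    /// Safe to propagate back through RPC.
    public_message: String,
    /// Intended for node logs or metrics only.
    operator_log: String,
}

fn build_rejection(flagged: &[String]) -> Rejection {
    Rejection {
        public_message: PUBLIC_REJECTION.to_string(),
        operator_log: format!("address screening: flagged addresses {flagged:?}"),
    }
}
```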

Fixes 43 clippy warnings across calldata.rs, client.rs, and proto.rs:

calldata.rs:
- Combine trait bounds: T: PoolTransaction + alloy_consensus::Transaction
- Add #[allow(clippy::indexing_slicing)] for length-checked slicing
  operations (all slices are validated before use)

client.rs:
- Make builder methods const: timeout() and fail_mode()

proto.rs (generated code):
- Add module-level #[allow] for generated code patterns:
  unreachable_pub, missing_const_for_fn, doc_markdown
- Add Eq to all PartialEq derives (ExtendedHealthRequest,
  ExtendedHealthResponse, BatchCheckRequest, BatchCheckResponse)
- Change visibility to pub (needed for benches to access)

benches/calldata_extraction.rs:
- Replace std::iter::repeat().take() with repeat_n() (clippy::manual_repeat_n)

screening.rs tests:
- Replace #[should_panic] with explicit error checking
  (panic message from clap doesn't match expected string exactly)

All changes maintain functionality while satisfying strict clippy lints
enabled in CI (-D warnings with custom lint configuration).
Contributor

@cdrappi cdrappi left a comment


Hey guys. I stopped reviewing the whole thing after finding one super critical issue, detailed below

TLDR: we have some work to get our seismic tx pipeline in a spot where you can actually screen on encrypted calldata. right now, this code as executed on seismic transactions will not screen anything, defeating the purpose entirely

Right now, this is more of an us problem than a you problem; when we've got our ducks in line we will circle back with some instructions on correctly wiring in Seismic transactions to screening

} => {
// Extract all screenable addresses
let extraction_start = Instant::now();
let addresses = extract_addresses(valid_tx.transaction());
Contributor

@cdrappi cdrappi Mar 23, 2026


if valid_tx is a seismic transaction here, we will fail to extract addresses on calldata: it'll run encrypted calldata through the screening. that means it will miss function selectors, address inputs, etc.

in our node, we currently decrypt in the block executor. in light of this audit discovery – independent but very related – we need to think of a plan that both:

  • prevents users from DOS'ing us with invalid ciphertext, and
  • allows us to screen the ciphertext itself, ideally in the mempool

You should be able to see that issue on the aegis-cipherowl account (invited this to Seismic org as a guest). If you'd like us to add another account, let me know and we can give you access to the repo with the issue

Also, since I've added aegis-cipherowl to our org, CI workflows should run automatically without one of us approving (at least after you accept the invite)

Contributor


i think we have some work to do first to get our calldata encryption in the right spot; my main man @HenryMBaldwin is on this starting this afternoon. so for now, i'd hang tight (maybe aside from addressing @daltoncoder's comments)


3 participants