Data race and goroutine leak in replication request handler #3

@MastaP

Description

Summary

partition/recovery.go:131-172 — Each incoming LedgerReplicationRequest spawns an untracked goroutine that:

  1. Reads n.blockStore concurrently with the main loop goroutine that writes to it (via storeBlock/deleteBlock in finalizeBlock). There is no synchronization, so this is a data race — undefined behavior under Go's memory model.
  2. Has no concurrency limit — a malicious peer can flood the node with replication requests to spawn unbounded goroutines, exhausting memory and scheduler resources.
  3. Cannot be cancelled — the goroutines are untracked and never check context cancellation during the DB iteration loop, so they can outlive shutdown.

Severity

Critical — data race (UB) + remotely exploitable goroutine/memory exhaustion.

Suggested Fix

  • Process replication requests on the main event loop, or add proper synchronization around blockStore access.
  • Add a concurrency limiter (e.g., a semaphore) to bound concurrent replication handlers.
  • Track spawned goroutines and cancel them on shutdown.

Related

  • No rate limiting or deduplication of replication requests (the UUID in the request is unused for dedup).
  • Replication responses are accepted from any peer, not just the one the request was sent to (recovery.go:178-234). Combined with the nil shardConfHash check in block_processor.go:21 effectively skipping validation, an attacker could feed blocks produced under a different shard configuration.
  • Replication and handshake messages lack cryptographic authentication — IsValid() only checks non-empty fields.
