perf: streaming reads for large event streams (PostgreSQL/SQLite) #364

@jwilger-ai-bot

Description

Problem

Both the PostgreSQL backend (via fetch_all()) and the SQLite backend collect the entire event history into a Vec before read_stream() returns. For large streams (10K+ events), this means:

  • Peak memory = entire stream deserialized in memory at once
  • All deserialization must complete before the first event can be processed
  • No opportunity for the consumer (e.g., state reconstruction fold) to process events incrementally

Proposed Solution

Return a streaming iterator or async stream from read_stream() that deserializes and yields events one at a time (or in small batches). The EventStreamReader type could wrap a stream internally while maintaining the current API for consumers that want to collect.

This would require changes to the EventStore trait's read_stream return type.

Expected Impact

  • Reduced peak memory for large streams
  • Same or slightly better latency (overlapping deserialization with processing)
  • Most impactful for projection catch-up on streams with thousands of events

Caveats

This is an API-level change to the EventStore trait that would affect all backend implementations. The current EventStreamReader with into_iter() may need to become a Stream or provide both collected and streaming modes.

Location

  • eventcore-types/src/store.rs — EventStore::read_stream return type, EventStreamReader
  • All backend implementations

Benchmark Baseline

Run cargo bench -p eventcore-bench --bench store_operations -- 'store/read_stream' to measure before/after.

Metadata

Assignees: none
Labels: P3-low (Polish, optimization), enhancement (New feature or request)
Projects: none
Milestone: none
Relationships: none yet
Development: no branches or pull requests