Zero-alloc drain path for ring buffer hot loop #34

@ImpulseB23

Description
Follow-up to PRI-6. The current drain path allocates per tick:

  • RingBufReader::drain() returns Vec<Vec<u8>> (one Vec per message plus the outer vec)
  • host::parse_batch allocates a Vec<UnifiedMessage>
  • serde_json::from_slice allocates Strings for every owned field in the deserialized types

At the documented 10k–20k msg/sec peak targets from docs/performance.md, these allocations will likely show up in profiles. The current implementation is correct and simple enough to ship for the first end-to-end milestone; optimization should be driven by actual benchmarks, not speculation.

Suggested direction (when benchmarks show it's needed)

  1. Replace RingBufReader::drain() -> Vec<Vec<u8>> with a callback-style API:

    pub fn drain_with<F: FnMut(&[u8])>(&mut self, f: F)

    so the caller sees each message's bytes as a slice directly into the mapped region, with no per-message allocation.

  2. Keep a caller-owned Vec<UnifiedMessage> scratch buffer that is cleared and refilled each tick instead of allocated fresh. This is trivial to add to the existing drain loop in lib.rs.

  3. Investigate simd-json or sonic-rs for faster parsing. sonic-rs is close to a drop-in replacement for serde_json; simd-json parses in place and takes &mut [u8], so a drain API that hands out shared &[u8] slices would need either a copy or a mutable variant.

  4. If profiling shows allocation of the UnifiedMessage fields is the hot path (not likely — these are small strings), consider Cow<'a, str> or SmartString fields.
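To make (1) and (2) concrete, here is a rough sketch of the callback drain plus a reused scratch buffer. The real RingBufReader reads from a mapped region; the RingBuf mock, its frames field, the trimmed-down UnifiedMessage, and drain_tick are all illustrative stand-ins, not the actual types in the repo.

```rust
/// Hypothetical stand-in for the mapped ring; the real `RingBufReader`
/// drains from a shared-memory region rather than a `Vec` of frames.
pub struct RingBuf {
    pub frames: Vec<Vec<u8>>,
}

impl RingBuf {
    /// Item 1: hand each message to the caller as a borrowed slice, so the
    /// drain path allocates no per-message `Vec<u8>` and no outer `Vec`.
    pub fn drain_with<F: FnMut(&[u8])>(&mut self, mut f: F) {
        for frame in self.frames.drain(..) {
            f(&frame);
        }
    }
}

/// Trimmed-down stand-in for the real `UnifiedMessage`.
#[derive(Debug, PartialEq)]
pub struct UnifiedMessage {
    pub body: String,
}

/// Item 2: one tick of the drain loop, refilling a caller-owned scratch
/// buffer instead of allocating a fresh `Vec<UnifiedMessage>` per tick.
pub fn drain_tick(ring: &mut RingBuf, scratch: &mut Vec<UnifiedMessage>) {
    scratch.clear(); // keeps capacity; no reallocation in the steady state
    ring.drain_with(|bytes| {
        // Stand-in for host::parse_batch / serde_json::from_slice.
        scratch.push(UnifiedMessage {
            body: String::from_utf8_lossy(bytes).into_owned(),
        });
    });
}
```

The caller holds the scratch Vec across ticks, so after warm-up its capacity stabilizes and the per-tick allocations reduce to whatever the parse step itself does.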

Acceptance

  • Benchmark the current drain path at 20k msg/sec synthetic load (needs a benchmark harness, which is its own small task).
  • Only then optimize. Do not land this speculatively.
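For the harness, something as small as the following would do for a first number; the 256-byte frame size and the function are assumptions for illustration, not a spec for the real benchmark task (which should probably use criterion or similar).

```rust
use std::time::Instant;

/// Minimal synthetic-load sketch: drive `ticks` iterations of `per_tick`
/// frames each and return the measured messages/sec, to compare against
/// the 20k msg/sec target. The workload is a placeholder; the real
/// harness would call the actual drain + parse path here.
pub fn synthetic_drain_rate(ticks: usize, per_tick: usize) -> f64 {
    let frame = vec![0u8; 256]; // representative message size (assumption)
    let start = Instant::now();
    for _ in 0..ticks {
        for _ in 0..per_tick {
            // black_box keeps the optimizer from deleting the loop body.
            std::hint::black_box(frame.as_slice());
        }
    }
    (ticks * per_tick) as f64 / start.elapsed().as_secs_f64().max(1e-9)
}
```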

Source

Flagged by Copilot review on PR #33.
