Provider middleware: retry with exponential backoff #41

@koko1123

Motivation

Production bots (liquidation keepers, arbitrage searchers) make continuous RPC calls and must handle:

  • Transient RPC failures (rate limits, timeouts, 503s)
  • Network blips

Without retry logic, a single failed eth_getLogs call in a liquidation keeper can cause a missed opportunity or position monitoring gap. This is table stakes for any bot running in production.

API

New file: src/middleware.zig (or extension of provider.zig)

pub const RetryOpts = struct {
    /// Maximum number of attempts (1 = no retry). Default: 3.
    max_attempts: u32 = 3,
    /// Initial backoff delay in milliseconds. Default: 100.
    initial_backoff_ms: u64 = 100,
    /// Backoff multiplier. Default: 2.0 (exponential).
    backoff_multiplier: f64 = 2.0,
    /// Maximum backoff delay in milliseconds. Default: 5_000.
    max_backoff_ms: u64 = 5_000,
    /// Jitter factor 0.0–1.0 (adds randomness to prevent thundering herd). Default: 0.1.
    jitter: f64 = 0.1,
    /// Which errors should trigger a retry. Default: connection/timeout errors only.
    retryable: RetryableErrors = .connection_errors,

    pub const RetryableErrors = enum {
        connection_errors,   // only network errors
        all_rpc_errors,      // includes rate limits (429) and server errors (5xx)
    };
};

/// Wraps a Provider and retries failed calls according to RetryOpts.
pub const RetryingProvider = struct {
    inner: *Provider,
    opts: RetryOpts,

    pub fn init(inner: *Provider, opts: RetryOpts) RetryingProvider {
        return .{ .inner = inner, .opts = opts };
    }
    // All Provider methods are forwarded with retry wrapping.
};
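
To make "forwards all Provider methods with retry wrapping" concrete, here is a rough sketch of what one forwarded method could look like. The `Provider` stub, its `getBlockNumber` method, the `failures_left` field and the `ConnectionTimedOut` error are stand-ins invented for illustration, not the library's actual API. Backoff sleeping and error classification are left out to keep the attempt loop visible.

```zig
const std = @import("std");

/// Hypothetical stand-in for the library's Provider, so the sketch compiles on its own.
const Provider = struct {
    calls: u32 = 0,
    failures_left: u32 = 0,

    pub fn getBlockNumber(self: *Provider) !u64 {
        self.calls += 1;
        if (self.failures_left > 0) {
            self.failures_left -= 1;
            return error.ConnectionTimedOut;
        }
        return 123;
    }
};

/// Trimmed copy of the proposed RetryOpts (only the field this sketch uses).
const RetryOpts = struct {
    max_attempts: u32 = 3,
};

const RetryingProvider = struct {
    inner: *Provider,
    opts: RetryOpts,

    pub fn init(inner: *Provider, opts: RetryOpts) RetryingProvider {
        return .{ .inner = inner, .opts = opts };
    }

    /// One forwarded method: same shape as the inner call, wrapped in an attempt loop.
    /// Every error is treated as retryable here; a real version would classify first.
    pub fn getBlockNumber(self: RetryingProvider) !u64 {
        var attempt: u32 = 1;
        while (true) : (attempt += 1) {
            return self.inner.getBlockNumber() catch |err| {
                // Out of attempts: surface the last error to the caller.
                if (attempt >= self.opts.max_attempts) return err;
                continue;
            };
        }
    }
};

pub fn main() !void {
    var provider = Provider{ .failures_left = 2 };
    const retrying = RetryingProvider.init(&provider, .{});
    const block = try retrying.getBlockNumber();
    std.debug.print("block {d} after {d} calls\n", .{ block, provider.calls });
}
```

With the default `max_attempts = 3`, the stub's two simulated failures are absorbed and the third call succeeds. A third failure would be returned to the caller unchanged.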

Behaviour

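  • On a retryable error: wait for the current backoff (initial_backoff_ms scaled by backoff_multiplier on each retry, capped at max_backoff_ms) multiplied by a random jitter factor, then retry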
  • On non-retryable error (e.g. invalid params, revert): fail immediately
  • Log retry attempts (optional callback)
  • Return the last error once max_attempts is exhausted
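
The delay and classification rules above could be sketched roughly as follows. This is one possible reading, not a settled design. It assumes "jitter_noise" means a multiplicative factor in [1 - jitter, 1 + jitter), and that the jittered delay is clamped to max_backoff_ms again. The names `BackoffOpts`, `backoffDelayMs` and `shouldRetry`, and the error names in the switch, are all hypothetical. The random sample is passed in as `unit` so the function stays deterministic and easy to test.

```zig
const std = @import("std");

const RetryableErrors = enum { connection_errors, all_rpc_errors };

/// Subset of the proposed RetryOpts fields that affect the delay.
const BackoffOpts = struct {
    initial_backoff_ms: u64 = 100,
    backoff_multiplier: f64 = 2.0,
    max_backoff_ms: u64 = 5_000,
    jitter: f64 = 0.1,
};

/// Delay before retry number `retry` (0 = first retry).
/// `unit` is a caller-supplied random sample in [0, 1).
fn backoffDelayMs(opts: BackoffOpts, retry: u32, unit: f64) u64 {
    const initial: f64 = @floatFromInt(opts.initial_backoff_ms);
    const cap: f64 = @floatFromInt(opts.max_backoff_ms);
    const exponent: f64 = @floatFromInt(retry);
    // Exponential growth, capped at max_backoff_ms.
    const base = @min(initial * std.math.pow(f64, opts.backoff_multiplier, exponent), cap);
    // Assumed meaning of "jitter_noise": a factor in [1 - jitter, 1 + jitter).
    const noise = 1.0 + opts.jitter * (2.0 * unit - 1.0);
    // Clamp again so jitter never pushes the delay past the cap (an assumption).
    return @intFromFloat(@min(base * noise, cap));
}

/// Decides whether an error should trigger another attempt.
/// The error names are placeholders for whatever the transport actually returns.
fn shouldRetry(policy: RetryableErrors, err: anyerror) bool {
    return switch (err) {
        error.ConnectionRefused, error.ConnectionTimedOut => true,
        error.RateLimited, error.ServerError => policy == .all_rpc_errors,
        else => false, // e.g. invalid params or a revert: fail immediately
    };
}
```

With the defaults and a neutral sample (`unit = 0.5`), the delays grow from 100 ms and flatten at the 5,000 ms cap. A caller would draw `unit` from whichever random source it already uses.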

Why middleware pattern

Keeping retry logic as a wrapper (not baked into transport) means:

  • The base Provider stays simple and testable
  • Bots can opt in to retry without changing their existing code
  • Retry behaviour is composable (retry + caching in future)
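
As a rough illustration of the opt-in point: if bot code only relies on the method names, the same function can take either the plain provider or the wrapper. This sketch assumes the bot is written against `anytype`, which may not match how existing bots are structured. `Provider`, `getBlockNumber` and `blocksBehind` are hypothetical stand-ins.

```zig
const std = @import("std");

// Hypothetical minimal provider and wrapper; the real types would live in the library.
const Provider = struct {
    pub fn getBlockNumber(_: *Provider) !u64 {
        return 123;
    }
};

const RetryingProvider = struct {
    inner: *Provider,

    pub fn getBlockNumber(self: RetryingProvider) !u64 {
        // Retry loop omitted; this only shows that the method shape matches.
        return self.inner.getBlockNumber();
    }
};

/// Bot logic written once against "anything with getBlockNumber()".
fn blocksBehind(provider: anytype, target: u64) !u64 {
    const head = try provider.getBlockNumber();
    return if (head >= target) 0 else target - head;
}

pub fn main() !void {
    var provider = Provider{};
    // Same call site, with and without the retry wrapper.
    const plain = try blocksBehind(&provider, 200);
    const wrapped = try blocksBehind(RetryingProvider{ .inner = &provider }, 200);
    std.debug.print("plain={d} wrapped={d}\n", .{ plain, wrapped });
}
```

Opting in then comes down to what gets passed at the call site. A later caching wrapper could follow the same shape and wrap or be wrapped by the retrying one.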

Labels: enhancement
