[Feature] Fallback Model Chain & Routing Strategies for @yourgpt/llm-sdk #76

@Rohitjoshi9023

🚀 Feature Request

Problem

Provider outages, rate limits, and timeouts cause complete request failures with no recovery. There's no built-in way to fall back to a backup model or distribute load across providers — all resilience logic is left for the application developer to wire manually.


💡 Proposed Solution

Two focused additions to @yourgpt/llm-sdk:

  1. Fallback Chain — auto-retry with the next provider on failure
  2. Routing Strategies — priority (default) and round-robin

Fallback Chain

When the primary model fails, the SDK automatically retries with the next model in the fallbacks list — transparently, without any change to the calling code.

Triggers fallback on:

  • 5xx server errors
  • Network timeouts
  • Provider unavailability
  • 429 rate limit errors

Does not trigger on:

  • 4xx client errors (bad request, invalid API key) — these are caller bugs, not provider failures
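The trigger rules above amount to a small error classifier. A minimal sketch, assuming an error shape with an optional HTTP `status` and network error `code` (these names are illustrative, not the SDK's actual API):

```typescript
// Hypothetical error shape — illustrative only, not a confirmed SDK type.
interface ProviderError {
  status?: number; // HTTP status, if the provider responded
  code?: string;   // network-level error code, e.g. "ETIMEDOUT"
}

function isRetryable(err: ProviderError): boolean {
  // Network-level failures: the provider never answered.
  if (err.code === "ETIMEDOUT" || err.code === "ECONNREFUSED") return true;
  if (err.status === undefined) return true; // provider unreachable
  if (err.status === 429) return true;       // rate limited
  if (err.status >= 500) return true;        // server error
  return false; // 4xx client errors are caller bugs: do not fall back
}
```

Keeping 4xx errors non-retryable matters: retrying an invalid API key against three more providers just burns attempts and hides the real bug from the caller.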

An onFallback callback fires on each failed attempt, exposing the attempted model, next model, error message, and attempt number — for logging and observability.

A FallbackExhaustedError is thrown when all models in the chain fail, with a per-model breakdown of what failed and why.
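The whole flow — try each model in turn, fire `onFallback` on each failure, throw `FallbackExhaustedError` with a per-model breakdown when the chain is exhausted — could be sketched like this. All names here (`withFallback`, `AttemptInfo`) are assumptions about the proposed design, not an existing API:

```typescript
type AttemptInfo = {
  model: string;      // the model that was attempted
  nextModel?: string; // the model that will be tried next, if any
  error: string;
  attempt: number;
};

class FallbackExhaustedError extends Error {
  constructor(public attempts: AttemptInfo[]) {
    super(`All ${attempts.length} model(s) in the fallback chain failed`);
  }
}

async function withFallback<T>(
  models: string[],
  call: (model: string) => Promise<T>,
  onFallback?: (info: AttemptInfo) => void,
): Promise<T> {
  const attempts: AttemptInfo[] = [];
  for (let i = 0; i < models.length; i++) {
    try {
      return await call(models[i]);
    } catch (err) {
      // A real implementation would first classify the error (5xx/429/timeout
      // vs. 4xx) and rethrow non-retryable errors immediately.
      const info: AttemptInfo = {
        model: models[i],
        nextModel: models[i + 1],
        error: String(err),
        attempt: i + 1,
      };
      attempts.push(info);
      onFallback?.(info); // observability hook fires on every failed attempt
    }
  }
  throw new FallbackExhaustedError(attempts); // per-model breakdown in `attempts`
}
```

The key property is transparency: the caller awaits one promise and either gets a result from whichever model succeeded or a single structured error describing every failure.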


Routing Strategies

Instead of always trying the primary model first, a routing strategy determines which model in the pool to call first.

| Strategy | Description |
| --- | --- |
| `priority` | Try models in the defined order. Default. |
| `round-robin` | Rotate the starting model evenly across calls. |
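The round-robin rotation only changes which model goes *first*; the rest of the pool still serves as the fallback chain for that call. A minimal sketch (function name is illustrative):

```typescript
// Given the model pool and a monotonically increasing call counter,
// return the order in which models should be attempted for this call.
function roundRobinOrder(models: string[], callIndex: number): string[] {
  const start = callIndex % models.length; // rotate the starting model each call
  return [...models.slice(start), ...models.slice(0, start)];
}
```

So with a pool of `["a", "b", "c"]`, call 0 attempts `a → b → c`, call 1 attempts `b → c → a`, and so on, spreading primary-model load evenly while preserving full fallback coverage.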

Routing Store (Pluggable State for Strategies)

Strategies like round-robin need to track state (e.g. which model was last used) to work correctly across multiple calls. By default the SDK uses an in-memory store — works out of the box for single-process apps but resets on restart and does not share state across instances.

For production multi-instance or serverless deployments, users can plug in their own store via a simple get/set interface — no specific client is mandated.

  • Built-in: MemoryRoutingStore (default, zero config, zero deps)
  • Bring your own: Any store that implements the interface — Redis, Upstash, Cloudflare KV, DynamoDB, or anything else. The SDK ships the interface, the user owns the implementation.

This keeps the SDK lightweight and deployment-agnostic — no Redis client is ever bundled or required.
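The pluggable store could be as small as a two-method async interface. A sketch of the proposed shape — the interface and class names follow the issue's wording, but the exact signatures are assumptions:

```typescript
// The SDK ships only this interface; users own the implementation.
interface RoutingStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
}

// Built-in default: zero config, zero deps. State lives in process memory,
// so it resets on restart and is not shared across instances.
class MemoryRoutingStore implements RoutingStore {
  private data = new Map<string, string>();

  async get(key: string): Promise<string | null> {
    return this.data.get(key) ?? null;
  }

  async set(key: string, value: string): Promise<void> {
    this.data.set(key, value);
  }
}
```

A Redis or Cloudflare KV backed store would implement the same two methods by wrapping that client's own get/set — the SDK itself never imports any of those clients.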
