🚀 Feature Request
Problem
Provider outages, rate limits, and timeouts cause complete request failures with no recovery. There's no built-in way to fall through to a backup model or distribute load across providers — leaving all resilience logic to the application developer to wire manually.
💡 Proposed Solution
Two focused additions to `@yourgpt/llm-sdk`:
- Fallback Chain — auto-retry with the next provider on failure
- Routing Strategies — priority (default) and round-robin
Fallback Chain
When the primary model fails, the SDK automatically retries with the next model in the `fallbacks` list, transparently and without any change to the calling code.
Triggers fallback on:
- `5xx` server errors
- Network timeouts
- Provider unavailability
- `429` rate limit errors

Does not trigger on:
- `4xx` client errors (bad request, invalid API key); these are caller bugs, not provider failures
An `onFallback` callback fires on each failed attempt, exposing the attempted model, next model, error message, and attempt number, for logging and observability.
A `FallbackExhaustedError` is thrown when all models in the chain fail, with a per-model breakdown of what failed and why.
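For concreteness, here is a sketch of how this could look at the call site. The `createClient` factory, the option names, and the `generate` method are illustrative assumptions for this proposal, not the SDK's current API:

```ts
import { createClient, FallbackExhaustedError } from '@yourgpt/llm-sdk'; // hypothetical exports

const llm = createClient({
  model: 'openai/gpt-4o', // primary
  fallbacks: ['anthropic/claude-3-5-sonnet', 'google/gemini-1.5-pro'],
  // Fires on each failed attempt, before the next model is tried.
  onFallback: ({ attemptedModel, nextModel, error, attempt }) => {
    console.warn(`attempt ${attempt}: ${attemptedModel} failed (${error}); trying ${nextModel}`);
  },
});

try {
  // The call site is identical whether the primary answers or a fallback does.
  const result = await llm.generate({ prompt: 'Summarize this document.' });
  console.log(result.text);
} catch (err) {
  if (err instanceof FallbackExhaustedError) {
    // All models in the chain failed; the error carries a per-model breakdown
    // (the `attempts` property is a hypothetical shape for that breakdown).
    console.error('all providers failed:', err.attempts);
  } else {
    throw err; // 4xx client errors and other caller bugs surface unchanged
  }
}
```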
Routing Strategies
Instead of always trying the primary model first, a routing strategy determines which model in the pool to call first.
| Strategy | Description |
| --- | --- |
| `priority` | Try models in their defined order. Default. |
| `round-robin` | Rotate the starting model evenly across calls. |
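Selecting a strategy might look like this, reusing the hypothetical `createClient` from above (option names again assumed):

```ts
// Hypothetical: 'round-robin' rotates which model is tried first on each call,
// spreading load across the pool instead of always hitting the same primary.
const llm = createClient({
  models: ['openai/gpt-4o', 'anthropic/claude-3-5-sonnet'],
  routing: { strategy: 'round-robin' }, // omit for the default 'priority'
});
```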
Routing Store (Pluggable State for Strategies)
Strategies like `round-robin` need to track state (e.g. which model was last used) to work correctly across multiple calls. By default the SDK uses an in-memory store, which works out of the box for single-process apps but resets on restart and does not share state across instances.
For production multi-instance or serverless deployments, users can plug in their own store via a simple `get`/`set` interface; no specific client is mandated.
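A minimal sketch of what that contract could look like; the `RoutingStore` name and shape are assumptions for this proposal:

```ts
// Hypothetical store contract: async get/set over string keys is all a
// strategy needs to persist its cursor (e.g. the index of the last model used).
interface RoutingStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
}
```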
- Built-in: `MemoryRoutingStore` (default, zero config, zero deps)
- Bring your own: Any store that implements the interface — Redis, Upstash, Cloudflare KV, DynamoDB, or anything else. The SDK ships the interface, the user owns the implementation.
This keeps the SDK lightweight and deployment-agnostic — no Redis client is ever bundled or required.
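As an illustration of bring-your-own, a Redis-backed store using the `ioredis` client could satisfy the interface sketched above in a few lines (the `routing.store` option is, again, an assumption):

```ts
import Redis from 'ioredis';

// User-owned implementation; the SDK itself never imports a Redis client.
class RedisRoutingStore implements RoutingStore {
  private redis = new Redis(process.env.REDIS_URL ?? 'redis://localhost:6379');

  async get(key: string): Promise<string | null> {
    return this.redis.get(key);
  }

  async set(key: string, value: string): Promise<void> {
    await this.redis.set(key, value);
  }
}

const llm = createClient({
  models: ['openai/gpt-4o', 'anthropic/claude-3-5-sonnet'],
  routing: { strategy: 'round-robin', store: new RedisRoutingStore() },
});
```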