What's new
Streaming schema validation (#41)
All four providers now surface malformed SSE frames as SchemaDriftError and fire onSchemaDrift instead of silently swallowing them. Anthropic additionally validates the content_block_delta event shape and the type of delta.text; future tool-streaming delta types are skipped via a forward-compatible discriminator check.
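The validation described above might look roughly like the following. This is a minimal sketch, not the package's code: the SchemaDriftError fields, the DriftHandler type, and the parseTextDelta helper are all hypothetical names introduced for illustration. Only the frame layout (content_block_delta with a delta of type text_delta) follows Anthropic's documented streaming format.

```typescript
// Hypothetical error type; the real SchemaDriftError may carry different fields.
class SchemaDriftError extends Error {
  readonly path: string;
  constructor(path: string, message: string) {
    super(message);
    this.name = "SchemaDriftError";
    this.path = path;
  }
}

// Hypothetical callback shape standing in for onSchemaDrift.
type DriftHandler = (err: SchemaDriftError) => void;

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Returns the text carried by a content_block_delta frame, or null when the
// delta uses a type this sketch does not know (skipped for forward compatibility).
// Malformed frames notify the handler and then throw.
function parseTextDelta(raw: string, onSchemaDrift?: DriftHandler): string | null {
  const fail = (path: string, message: string): never => {
    const err = new SchemaDriftError(path, message);
    onSchemaDrift?.(err);
    throw err;
  };

  let frame: unknown;
  try {
    frame = JSON.parse(raw);
  } catch {
    return fail("$", "SSE data is not valid JSON");
  }
  if (!isRecord(frame) || frame.type !== "content_block_delta") {
    return fail("type", "expected a content_block_delta event");
  }
  const delta = frame.delta;
  if (!isRecord(delta) || typeof delta.type !== "string") {
    return fail("delta.type", "delta must be an object with a string type");
  }
  // Unknown discriminator: skip instead of reporting drift.
  if (delta.type !== "text_delta") return null;

  const text = delta.text;
  if (typeof text !== "string") {
    return fail("delta.text", "text_delta.text must be a string");
  }
  return text;
}
```

Skipping unknown discriminators while failing on a wrong field type keeps new delta kinds from breaking existing streams, and still reports real drift in the fields the caller depends on.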
Cache-aware routing (#52)
- New CacheHints type: LLMRequest.cache is a no-op for callers that don't set it
- Anthropic: strategy: 'provider-prefix' wraps the system prompt as a content block with cache_control: { type: 'ephemeral' } and marks the last tool as a breakpoint
- OpenAI / Groq / Cerebras: automatic caching with no request-side translation needed
- Cached token counts normalized into TokenUsage: cachedInputTokens, cacheReadInputTokens, cacheCreationInputTokens
- supportsPromptCache flag added to ModelCapabilities
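As an illustration of the Anthropic translation, the sketch below wraps the system prompt and marks the last tool when the provider-prefix strategy is set. The LLMRequest, CacheHints, and ToolDef shapes here are simplified assumptions, and toAnthropicBody is a hypothetical helper name. Only the cache_control: { type: 'ephemeral' } marker comes from the notes above.

```typescript
// Simplified, assumed request types; the package's real definitions may differ.
interface CacheHints {
  strategy: "provider-prefix";
}
interface ToolDef {
  name: string;
  description: string;
  input_schema: Record<string, unknown>;
}
interface LLMRequest {
  system?: string;
  tools?: ToolDef[];
  cache?: CacheHints;
}

type CacheControl = { type: "ephemeral" };
interface AnthropicBody {
  system?: string | Array<{ type: "text"; text: string; cache_control?: CacheControl }>;
  tools?: Array<ToolDef & { cache_control?: CacheControl }>;
}

// Hypothetical translator: builds only the cache-relevant part of the request body.
function toAnthropicBody(req: LLMRequest): AnthropicBody {
  const mark = req.cache?.strategy === "provider-prefix";
  const cacheControl: CacheControl = { type: "ephemeral" };
  const body: AnthropicBody = {};

  if (req.system !== undefined) {
    // With caching on, the plain string becomes a single text block carrying the marker.
    body.system = mark
      ? [{ type: "text", text: req.system, cache_control: cacheControl }]
      : req.system;
  }
  if (req.tools) {
    const lastIndex = req.tools.length - 1;
    // Only the final tool is marked, so it acts as the breakpoint for the tool list.
    body.tools = req.tools.map((tool, i) =>
      mark && i === lastIndex ? { ...tool, cache_control: cacheControl } : { ...tool },
    );
  }
  return body;
}
```

When cache is unset the body is passed through unchanged, which matches the no-op behaviour described for callers that don't opt in.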
Schema drift canary (#39 Part 2)
- extractShape(obj): flat path → type map from any response object
- compareShapes(golden, live): diffs two shape maps into { added, removed, changed }
- runCanaryCheck(provider, golden, liveResponse): one-shot canary returning a CanaryReport
- Golden fixtures committed for all five providers under src/__tests__/fixtures/response-shapes/
- All three utilities exported from the package root
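To make the shape-diff idea concrete, here is one possible sketch of the first two utilities. It is not the package's implementation. The path syntax (dots for keys, [] for array elements), the choice to sample only the first array element, and the string-array form of the diff are all assumptions made for this example.

```typescript
type ShapeMap = Record<string, string>;
interface ShapeDiff {
  added: string[];
  removed: string[];
  changed: string[];
}

// Flattens a value into "path -> type name" entries. Assumptions of this sketch:
// only leaves are recorded, arrays are sampled by their first element, and
// empty arrays/objects are recorded as "array"/"object".
function extractShape(value: unknown, path = "", out: ShapeMap = {}): ShapeMap {
  if (Array.isArray(value)) {
    if (value.length === 0) out[path] = "array";
    else extractShape(value[0], `${path}[]`, out);
  } else if (typeof value === "object" && value !== null) {
    const entries = Object.entries(value);
    if (entries.length === 0) out[path] = "object";
    for (const [key, child] of entries) {
      extractShape(child, path ? `${path}.${key}` : key, out);
    }
  } else {
    out[path] = value === null ? "null" : typeof value;
  }
  return out;
}

const has = (map: ShapeMap, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(map, key);

// Compares two shape maps: paths only in live, paths only in golden,
// and paths present in both whose type name differs.
function compareShapes(golden: ShapeMap, live: ShapeMap): ShapeDiff {
  const added = Object.keys(live).filter((k) => !has(golden, k)).sort();
  const removed = Object.keys(golden).filter((k) => !has(live, k)).sort();
  const changed = Object.keys(golden)
    .filter((k) => has(live, k) && golden[k] !== live[k])
    .sort();
  return { added, removed, changed };
}

// Illustrative data only, not one of the committed golden fixtures.
const golden = { id: "msg_1", stop_sequence: null, usage: { input_tokens: 3 } };
const live = {
  id: "msg_2",
  stop_sequence: "END",
  usage: { input_tokens: 3, cache_read_input_tokens: 0 },
};
const diff = compareShapes(extractShape(golden), extractShape(live));
console.log(diff);
```

In this sketch a field that is null in the golden fixture and a string in the live response shows up under changed, so a real canary would need some notion of nullable fields to avoid flagging it.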
Previously merged, now documented
- Factory-level streaming with fallback (#26): generateResponseStream uses the same circuit-breaker and fallback chain as generateResponse
- Tool-use loop helper (#28): generateResponseWithTools with ToolLoopLimitError, ToolLoopAbortedError, iteration/cost caps, and abort-signal support
- Cloudflare AI Gateway metadata forwarding (#29): cf-aig-* headers forwarded only when baseUrl matches the Gateway pattern
- Cloudflare LoRA / fine-tune forwarding (#51): LLMRequest.lora forwarded to Workers AI binding
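The conditional header forwarding from #29 could be sketched as below. The regular expression is an assumption based on the public AI Gateway URL layout (gateway.ai.cloudflare.com/v1/{account}/{gateway}/...), so the package's actual matcher may differ. isGatewayUrl and buildHeaders are hypothetical helper names.

```typescript
// Assumed Gateway URL pattern; the real matcher may be stricter or looser.
const GATEWAY_PATTERN = /^https:\/\/gateway\.ai\.cloudflare\.com\/v1\/[^/]+\/[^/]+(\/|$)/;

function isGatewayUrl(baseUrl: string): boolean {
  return GATEWAY_PATTERN.test(baseUrl);
}

// Merges cf-aig-* headers into the outgoing set only for Gateway base URLs,
// so Gateway metadata is never sent to a provider's own endpoint.
function buildHeaders(
  baseUrl: string,
  baseHeaders: Record<string, string>,
  gatewayHeaders: Record<string, string>,
): Record<string, string> {
  const headers: Record<string, string> = { ...baseHeaders };
  if (!isGatewayUrl(baseUrl)) return headers;
  for (const [name, value] of Object.entries(gatewayHeaders)) {
    const lower = name.toLowerCase();
    // Anything outside the cf-aig- namespace is ignored.
    if (lower.startsWith("cf-aig-")) headers[lower] = value;
  }
  return headers;
}
```

Gating on the base URL means a request that falls back to a direct provider endpoint does not carry Gateway-only metadata with it.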
Bug fixes
- stop_sequence schema false positive: was typed as string; the real Anthropic API returns null when no stop sequence triggers, causing SchemaDriftError on every normal response. Fixed to string-or-null.
- AnthropicProvider.getProviderBalance(): was calling a non-existent endpoint (/v1/organizations/cost_report). Now returns unavailable with a message directing users to the Admin API, matching the Groq pattern.
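The stop_sequence fix can be illustrated with a tiny field-check sketch. The FieldCheck type and the findDrift helper are hypothetical and stand in for whatever validator the package uses. The sample response mirrors a normal Anthropic completion, where stop_sequence is null.

```typescript
// Hypothetical per-field validator type.
type FieldCheck = (value: unknown) => boolean;

const isString: FieldCheck = (v) => typeof v === "string";
const isStringOrNull: FieldCheck = (v) => v === null || typeof v === "string";

// Returns the names of fields whose value fails its check.
function findDrift(
  response: Record<string, unknown>,
  schema: Record<string, FieldCheck>,
): string[] {
  return Object.entries(schema)
    .filter(([field, check]) => !check(response[field]))
    .map(([field]) => field);
}

// A normal response: the model stopped on its own, so no stop sequence fired.
const normalResponse = { stop_reason: "end_turn", stop_sequence: null };

// Old typing: stop_sequence must be a string, so every normal response is flagged.
const oldSchema = { stop_reason: isString, stop_sequence: isString };
// Fixed typing: null is accepted alongside string.
const fixedSchema = { stop_reason: isString, stop_sequence: isStringOrNull };

console.log(findDrift(normalResponse, oldSchema));
console.log(findDrift(normalResponse, fixedSchema));
```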
Full changelog
See CHANGELOG.md for the complete entry.