
Releases: Stackbilt-dev/llm-providers

v1.6.0 — SSE validation, cache hints, schema canary

27 Apr 23:23


What's new

Streaming schema validation (#41)

All four providers now surface malformed SSE frames as SchemaDriftError and fire the onSchemaDrift hook instead of silently swallowing them. Anthropic additionally validates content_block_delta event shape and delta.text type; future tool-streaming delta types are skipped via a forward-compat discriminator.
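
A minimal sketch of consuming the hook, assuming onSchemaDrift is accepted as a per-request option and the stream yields chunks with a text field (both the hook placement and the chunk shape are assumptions):

import { LLMProviders, SchemaDriftError } from '@stackbilt/llm-providers';

const llm = LLMProviders.fromEnv(process.env);

try {
  const stream = await llm.generateResponseStream({
    messages: [{ role: 'user', content: 'Summarize the release notes.' }],
    // Hypothetical hook placement: fires once per malformed frame.
    onSchemaDrift: (err: SchemaDriftError) => console.warn('SSE drift:', err.message),
  });
  for await (const chunk of stream) {
    process.stdout.write(chunk.text ?? '');
  }
} catch (err) {
  if (err instanceof SchemaDriftError) {
    // Frames that cannot be skipped surface as a thrown error.
    console.error('Stream aborted on schema drift:', err.message);
  } else {
    throw err;
  }
}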

Cache-aware routing (#52)

  • New CacheHints type on LLMRequest.cache — a no-op for callers that don't set it (usage sketch after this list)
  • Anthropic: strategy: 'provider-prefix' wraps the system prompt as a content block with cache_control: { type: 'ephemeral' } and marks the last tool as a breakpoint
  • OpenAI / Groq / Cerebras: automatic caching with no request-side translation needed
  • Cached token counts normalized into TokenUsage: cachedInputTokens, cacheReadInputTokens, cacheCreationInputTokens
  • supportsPromptCache flag added to ModelCapabilities
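
A minimal sketch of the hint flow, assuming llm is a configured LLMProviders instance, strategy is the only CacheHints field shown, and TokenUsage is exposed as response.usage (the field placement is an assumption):

const res = await llm.generateResponse({
  system: longSystemPrompt,               // wrapped as a cache_control block on Anthropic
  messages: [{ role: 'user', content: 'Answer from the cached context.' }],
  cache: { strategy: 'provider-prefix' }, // ignored by providers that cache automatically
});

// Normalized usage fields; zero or undefined when nothing was cached.
console.log(res.usage.cacheCreationInputTokens, res.usage.cacheReadInputTokens);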

Schema drift canary (#39 Part 2)

  • extractShape(obj) — flat path → type map from any response object
  • compareShapes(golden, live) — diffs two shape maps into { added, removed, changed }
  • runCanaryCheck(provider, golden, liveResponse) — one-shot canary returning a CanaryReport
  • Golden fixtures committed for all five providers under src/__tests__/fixtures/response-shapes/
  • All three utilities are exported from the package root (sketch below)
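
A minimal sketch of both call styles, given a liveResponse from any provider call; the fixture path, provider id, and array-shaped diff buckets are illustrative assumptions:

import { extractShape, compareShapes, runCanaryCheck } from '@stackbilt/llm-providers';
import golden from './fixtures/response-shapes/openai.json'; // illustrative fixture path

// Manual diff: both arguments are flat path → type maps.
const diff = compareShapes(golden, extractShape(liveResponse));
if (diff.added.length || diff.removed.length || diff.changed.length) {
  console.warn('Response shape drifted:', diff);
}

// Or the one-shot form, which returns a CanaryReport.
const report = runCanaryCheck('openai', golden, liveResponse);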

Previously merged, now documented

  • Factory-level streaming with fallback (#26) — generateResponseStream uses the same circuit-breaker and fallback chain as generateResponse
  • Tool-use loop helper (#28) — generateResponseWithTools with ToolLoopLimitError, ToolLoopAbortedError, iteration/cost caps, and abort-signal support (see the sketch after this list)
  • Cloudflare AI Gateway metadata forwarding (#29) — cf-aig-* headers forwarded only when baseUrl matches the Gateway pattern
  • Cloudflare LoRA / fine-tune forwarding (#51) — LLMRequest.lora forwarded to Workers AI binding
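
A minimal sketch of the tool-loop helper; the maxIterations option name and the tool definition are assumptions, while the error classes and abort-signal support come from the notes above:

import { LLMProviders, ToolLoopLimitError, ToolLoopAbortedError } from '@stackbilt/llm-providers';

const controller = new AbortController();

try {
  const result = await llm.generateResponseWithTools({
    messages: [{ role: 'user', content: 'Look up the order status for #1234.' }],
    tools: [orderLookupTool],  // hypothetical tool definition
    maxIterations: 5,          // hypothetical name for the iteration cap
    signal: controller.signal, // aborting throws ToolLoopAbortedError
  });
  console.log(result.text);
} catch (err) {
  if (err instanceof ToolLoopLimitError) {
    console.error('Hit the iteration/cost cap before the model finished.');
  } else if (err instanceof ToolLoopAbortedError) {
    console.error('Loop aborted by the caller.');
  } else {
    throw err;
  }
}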

Bug fixes

  • stop_sequence schema false positive — the field was typed as string, but the real Anthropic API returns null when no stop sequence fires, causing a SchemaDriftError on every normal response. Fixed to string-or-null.
  • AnthropicProvider.getProviderBalance() — was calling a non-existent endpoint (/v1/organizations/cost_report). It now returns unavailable with a message directing users to the Admin API, matching the Groq pattern.

Full changelog

See CHANGELOG.md for the complete entry.

v1.5.1 — fix Cloudflare llama-3.2 vision silent empty response

27 Apr 12:05
68f6daf


Fixed

  • analyzeImage() returned a silent empty response on Cloudflare's @cf/meta/llama-3.2-11b-vision-instruct. Via the Workers AI binding, this model requires a raw { image: number[], prompt, max_tokens } input shape, not the OpenAI-compatible messages/image_url format. On the chat path the binding returns choices[0].message.content === null, so extractText() silently returned "". The provider now detects this model and dispatches the raw binding format. Other vision models are unaffected. Fixes #53.
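
For reference, a sketch of the two input shapes involved; variable names and values are illustrative:

// OpenAI-compatible chat shape (what other CF vision models accept):
const chatInput = {
  messages: [{
    role: 'user',
    content: [
      { type: 'image_url', image_url: { url: `data:image/jpeg;base64,${base64}` } },
      { type: 'text', text: 'Describe this image' },
    ],
  }],
};

// Raw binding shape required by @cf/meta/llama-3.2-11b-vision-instruct:
const rawInput = {
  image: Array.from(new Uint8Array(imageBytes)), // number[], not a data URI
  prompt: 'Describe this image',
  max_tokens: 512,
};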

Full changelog: https://github.com/Stackbilt-dev/llm-providers/blob/main/CHANGELOG.md

v1.5.0

23 Apr 11:58


Consolidates the unreleased 1.4.0 scope and undocumented features into a single minor release. 1.4.0 was tagged in package.json but never published to npm; consumers upgrading from 1.3.0 receive all of the following atomically.

Added

  • Declarative model catalog (src/model-catalog.ts) — semantic catalog for provider/model metadata, recommendation use cases, lifecycle status, and runtime scoring
  • Runtime recommendation API — LLMProviders#getRecommendedModel(request, useCase?) exposes the same routing logic the factory uses internally (usage sketch after this list)
  • Schema drift envelope validation — OpenAIProvider, GroqProvider, CerebrasProvider, and AnthropicProvider now validate response envelopes at the provider boundary, throwing SchemaDriftError on mismatch instead of silently corrupting downstream consumers
  • LLMProviders.fromEnv() static factory — auto-discovers providers from Cloudflare Workers env bindings without manual wiring
  • Model drift test — asserts every provider's models[] is symmetrically covered by its capabilities map
  • Catalog tests — coverage for retired-model exclusion, health-aware ranking, request-shape use-case inference
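
A minimal sketch of the recommendation call; the use-case identifier is illustrative, not a documented catalog value:

import { LLMProviders } from '@stackbilt/llm-providers';

const llm = LLMProviders.fromEnv(process.env);

// Same scoring the factory applies internally; useCase is optional.
const pick = llm.getRecommendedModel(
  { messages: [{ role: 'user', content: 'Classify this support ticket.' }] },
  'classification', // hypothetical use-case id
);
console.log(pick);  // provider/model pair chosen from the catalog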

Changed

  • Factory routing selects provider/model pairs from the catalog instead of hardcoded ordering
  • Health-aware dispatch considers circuit-breaker state including degraded and recovering providers, not just fully open
  • Budget-aware dispatch — with a CreditLedger attached, selection demotes providers under high utilization or near projected depletion (sketch after this list)
  • Provider defaults for OpenAI, Anthropic, Cloudflare, Cerebras, and Groq resolve through the shared catalog
  • Cloudflare model recommendation prefers modern active baselines (Gemma 4, GPT-OSS) instead of legacy TinyLlama/Qwen heuristics
  • Recommendation exports exclude retired targets (e.g. gpt-4o) while preserving deprecated constants for compatibility
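
A sketch of attaching a ledger; the constructor options and the attachment point are assumptions for illustration, and only the demotion behavior comes from the notes above:

import { LLMProviders, CreditLedger } from '@stackbilt/llm-providers';

// Hypothetical budget shape; actual CreditLedger options may differ.
const ledger = new CreditLedger({ openai: { budgetUsd: 50 } });

// Hypothetical wiring of the ledger into the factory.
const llm = LLMProviders.fromEnv(process.env, { creditLedger: ledger });

// From here, selection demotes providers under high utilization
// or near projected depletion, per the budget-aware dispatch above.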

Deprecated

  • MODELS.CLAUDE_3_HAIKU — migrate to CLAUDE_HAIKU_4_5 or CLAUDE_3_5_HAIKU
  • MODELS.GPT_4O — migrate to GPT_4O_MINI or a current GPT-4 successor

Removed

  • claude-3-haiku-20240307, gpt-4o, and dead alias gpt-4-turbo-preview dropped from provider models[] and capabilities tables. Arbitrary-string passthrough on request inputs is unchanged — consumers pinning older MODELS enum values via string literals are not affected.
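
For example, pinning a removed id via a string literal still routes (sketch; the upstream model itself may of course be gone):

// Arbitrary-string passthrough: bypasses the catalog and capabilities tables.
await llm.generateResponse({
  model: 'claude-3-haiku-20240307', // removed from models[], still forwarded as-is
  messages: [{ role: 'user', content: 'Hello' }],
});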

Full changelog: CHANGELOG.md

v1.3.0 — Cloudflare Workers AI vision support

17 Apr 00:05
bf713cd


Added

  • Cloudflare Workers AI vision support — CloudflareProvider now accepts request.images and routes to vision-capable models. Previously image data was silently dropped on the CF path.
  • Three new CF vision models:
    • @cf/google/gemma-4-26b-a4b-it — 256K context, vision + function calling + reasoning
    • @cf/meta/llama-4-scout-17b-16e-instruct — natively multimodal, tool calling
    • @cf/meta/llama-3.2-11b-vision-instruct — image understanding
  • CloudflareProvider.supportsVision = true — factory's analyzeImage now dispatches to CF when configured.
  • Factory default vision fallback — getDefaultVisionModel() falls back to @cf/google/gemma-4-26b-a4b-it when neither Anthropic nor OpenAI is configured, enabling CF-only deployments to use analyzeImage().

Changed

  • Images are passed to CF using the OpenAI-compatible image_url content-part shape (base64 data URIs). HTTP image URLs throw a helpful ConfigurationError — fetch the image and pass bytes in image.data.
  • Attempting request.images on a non-vision CF model throws a ConfigurationError naming the vision-capable alternatives.

Usage

factory.analyzeImage({
  image: { data: base64, mimeType: 'image/jpeg' },
  prompt: 'Extract recipe data',
  model: '@cf/google/gemma-4-26b-a4b-it',
});

See #43 for details.

v1.1.0 — Multi-Modal: Image Generation

01 Apr 15:54


Image Generation Provider

@stackbilt/llm-providers is now multi-modal — text + image inference under one package.

New: ImageProvider

import { ImageProvider } from '@stackbilt/llm-providers';

const img = new ImageProvider({
  cloudflareAi: env.AI,
  geminiApiKey: env.GEMINI_API_KEY,
});

const result = await img.generateImage({
  prompt: 'a mountain landscape at sunset',
  model: 'flux-dev',
});
// result.image: ArrayBuffer, result.responseTime, result.provider

Built-in Models

Model                        Provider     Use Case
sdxl-lightning               Cloudflare   Fast drafts, free tier
flux-klein                   Cloudflare   Balanced quality/speed
flux-dev                     Cloudflare   Highest CF quality
gemini-flash-image           Google       Text rendering capable
gemini-flash-image-preview   Google       Latest preview model

Extracted from the img-forge production codebase; its battle-tested response normalization handles all Workers AI return formats.

Full changelog: CHANGELOG.md

v1.0.0 — Production Release

01 Apr 14:12


First stable release. Production-tested in the AEGIS cognitive kernel since AEGIS v1.72.0.

Highlights

  • Zero runtime dependencies — supply chain security by design
  • 5 providers: OpenAI, Anthropic, Cloudflare Workers AI, Cerebras, Groq
  • LLMProviders.fromEnv() — one-line multi-provider setup
  • Graduated circuit breakers — automatic failover with half-open probe recovery
  • CreditLedger — per-provider budget tracking with threshold alerts + burn rate projection
  • npm provenance — every version cryptographically linked to its source commit

Install

npm install @stackbilt/llm-providers

Quick Start

import { LLMProviders } from '@stackbilt/llm-providers';

const llm = LLMProviders.fromEnv(process.env);
const response = await llm.generateResponse({
  messages: [{ role: 'user', content: 'Hello!' }],
});

See README for full documentation.
See SECURITY.md for supply chain security policy.