model availability pings for OpenRouter (:free models) and NVIDIA NIM profiles, plus a 429 retry middleware for generation calls #313

Merged
Pento95 merged 5 commits into AventurasTeam:master from Pento95:master
Apr 25, 2026


Pento95 (Collaborator) commented Apr 20, 2026

Make room for a usable NVIDIA NIM provider!

Model health pings

  • New pingEnabled flag per profile (opt-in, only shown for OpenRouter / NIM).
  • Periodic low-cost chat/completions probe (max_tokens: 1) classifies each model as ok / slow / down / auth / rate_limited, with latency and quota headers captured when available.
  • Results are persisted in a new model_health_cache table (migration 034), keyed by (provider_id, model_id, api_key_hash). The API key is hashed to an 8-byte SHA-256 prefix.
  • Reactive SvelteMap-based store with DB hydration, concurrency-limited batch pings (5 parallel), provider-specific TTLs (auto vs manual trigger).
  • Warmup on app start: pings only models actually referenced by main narrative + generation presets, deduped by provider|baseUrl|apiKey.
  • UI: health indicator (signal icon + latency) inline in ModelSelector; ProfileWarningBanner blocks sending when the main narrative model is down or auth-failed.
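
For reviewers who want a concrete picture of the probe and its classification, here is a minimal sketch. It is not the code in modelPing.ts: the names `buildProbeBody`, `classifyPing` and `PingOutcome`, the 5-second "slow" threshold, and the exact mapping from HTTP status to health status are all assumptions made for illustration. Only the five status labels and `max_tokens: 1` come from the description above.

```typescript
// Status labels as listed in the PR description.
type HealthStatus = "ok" | "slow" | "down" | "auth" | "rate_limited";

// Hypothetical shape of a finished probe; httpStatus is null when the
// request never produced a response (network error or timeout).
interface PingOutcome {
  httpStatus: number | null;
  latencyMs: number;
}

// Assumed threshold; the real value may differ per provider.
const SLOW_THRESHOLD_MS = 5000;

// Smallest possible chat/completions request: one short message, one token.
function buildProbeBody(modelId: string) {
  return {
    model: modelId,
    messages: [{ role: "user", content: "ping" }],
    max_tokens: 1,
  };
}

// Assumed mapping: 401/403 -> auth, 429 -> rate_limited,
// 2xx -> ok or slow depending on latency, everything else -> down.
function classifyPing(
  outcome: PingOutcome,
  slowThresholdMs: number = SLOW_THRESHOLD_MS,
): HealthStatus {
  const { httpStatus, latencyMs } = outcome;
  if (httpStatus === null) return "down";
  if (httpStatus === 401 || httpStatus === 403) return "auth";
  if (httpStatus === 429) return "rate_limited";
  if (httpStatus >= 200 && httpStatus < 300) {
    return latencyMs > slowThresholdMs ? "slow" : "ok";
  }
  return "down";
}
```

Keeping the classifier a pure function of status and latency makes it easy to unit-test without hitting a provider.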

429 retry middleware

  • Wraps both doGenerate and doStream with up to 3 retries (10s / 20s / 30s + jitter).
  • Respects Retry-After (delta-seconds or HTTP-date), caps at 60s to fail fast on long cooldowns.
  • Abort-aware sleep: user cancellation interrupts the backoff immediately.
  • Note: only catches pre-stream 429s (rejected doStream() promise), not mid-stream errors.
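
The delay policy above can be sketched as a pure function. This is not the code in retryMiddleware.ts: `parseRetryAfterMs` and `nextDelayMs` are hypothetical names, the jitter range (up to 1 s) is assumed, and "caps at 60s to fail fast" is read here as "give up when the server asks for more than 60 s", which is only one possible interpretation. The sketch also applies no jitter to server-provided values, in line with the review feedback.

```typescript
// Schedule from the PR description: 10s / 20s / 30s, up to 3 retries.
const BASE_DELAYS_MS = [10000, 20000, 30000];
// Cap on server-requested waits, from the PR description.
const MAX_RETRY_AFTER_MS = 60000;
// Assumed jitter range; the description only says "+ jitter".
const MAX_JITTER_MS = 1000;

// Retry-After is either delta-seconds ("30") or an HTTP-date.
// Returns milliseconds to wait, or null if the header is absent or unparseable.
function parseRetryAfterMs(
  header: string | null | undefined,
  nowMs: number,
): number | null {
  if (!header) return null;
  const trimmed = header.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return null;
  return Math.max(0, dateMs - nowMs);
}

// attempt is 0-based. Returns the delay before the next retry,
// or null when the caller should stop retrying and surface the 429.
function nextDelayMs(
  attempt: number,
  retryAfter: string | null,
  nowMs: number = Date.now(),
  jitter: number = Math.random(),
): number | null {
  const base = BASE_DELAYS_MS[attempt];
  if (base === undefined) return null; // retries exhausted
  const serverMs = parseRetryAfterMs(retryAfter, nowMs);
  if (serverMs !== null) {
    // Assumed reading of the cap: fail fast on long cooldowns.
    return serverMs > MAX_RETRY_AFTER_MS ? null : serverMs;
  }
  return base + Math.floor(jitter * MAX_JITTER_MS);
}
```

Passing `nowMs` and `jitter` as parameters keeps the function deterministic in tests; the abort-aware sleep would sit outside it.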

Other

  • NVIDIA NIM profiles now ship with a default hiddenModels list to hide embedding / guard / reward / vision-only / legacy models from the selector.
  • DatabaseService.init guarded against concurrent double-open via a shared pending promise.
  • updateProfile purges the health cache when apiKey or providerType changes (prevents orphan rows under stale hashes).
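
The double-open guard on DatabaseService.init follows a common pattern, sketched below under assumptions. `GuardedInit` is a hypothetical stand-in, not the actual DatabaseService, and resetting the pending promise on failure (so a later call can retry) is a choice made for this sketch that the PR does not confirm.

```typescript
// Hypothetical wrapper: concurrent callers of init() share one in-flight
// promise, so the underlying open() runs at most once at a time.
class GuardedInit<T> {
  private pending: Promise<T> | null = null;
  private readonly open: () => Promise<T>;

  constructor(open: () => Promise<T>) {
    this.open = open;
  }

  // Deliberately not declared async: an async method would wrap the result
  // in a fresh promise on every call, while this returns the shared one.
  init(): Promise<T> {
    if (this.pending === null) {
      this.pending = this.open().catch((err: unknown) => {
        // Assumed behaviour: forget the failed attempt so init() can be retried.
        this.pending = null;
        throw err;
      });
    }
    return this.pending;
  }
}

// Usage: both calls resolve from the same single open().
let openCount = 0;
const db = new GuardedInit(async () => {
  openCount += 1;
  return "connection";
});
const first = db.init();
const second = db.init();
```

After a successful open, later calls keep returning the same resolved promise, so init() stays cheap to call from multiple entry points.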

Files

  • Migration: src-tauri/migrations/034_model_health_cache.sql, registered in src-tauri/src/lib.rs
  • New: src/lib/services/modelHealth{Cache,Orchestrator}.ts, src/lib/stores/modelHealth.svelte.ts, src/lib/services/ai/sdk/{middleware/retryMiddleware,providers/modelPing}.ts, src/lib/constants/modelHealth.ts,
    src/lib/utils/hashApiKey.ts, src/lib/components/settings/HealthIndicator.svelte
  • Touched: generate.ts, providers/config.ts, database.ts, settings.svelte.ts, ModelSelector.svelte, ProfileForm.svelte, AgentProfiles.svelte, tabs/api-connection.svelte, ProfileWarningBanner.svelte, +page.svelte

Pento added 2 commits April 21, 2026 01:25
… profiles, plus a 429 retry middleware for generation calls
gemini-code-assist (Bot, Contributor) left a comment

Code Review

This pull request introduces a model health monitoring system that pings OpenRouter and NVIDIA NIM models to track availability and latency, supported by a persistent SQLite cache and reactive UI indicators. It also includes a retry middleware for 429 errors. Feedback identifies a potential SQLite parameter limit issue in batch upserts, suggests optimizing reactivity by using a standard variable instead of state for tracking previous values, and recommends respecting server-provided Retry-After headers by avoiding jitter on those specific values.
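
To illustrate the parameter-limit concern: a multi-row upsert binds one parameter per column per row, so a large batch can exceed SQLite's per-statement variable limit (999 by default before SQLite 3.32.0, 32766 from 3.32.0 on). A minimal sketch of splitting the batch follows. `chunkForParamLimit` is a hypothetical helper, not the fix that was merged, and the choice of 999 as the default is a conservative assumption.

```typescript
// Split rows into chunks so that rows-per-chunk * paramsPerRow never
// exceeds maxParams. Each chunk can then be sent as its own INSERT.
function chunkForParamLimit<T>(
  rows: readonly T[],
  paramsPerRow: number,
  maxParams: number = 999, // conservative: default limit in older SQLite builds
): T[][] {
  if (!Number.isInteger(paramsPerRow) || paramsPerRow <= 0) {
    throw new RangeError("paramsPerRow must be a positive integer");
  }
  if (paramsPerRow > maxParams) {
    throw new RangeError("a single row exceeds the parameter limit");
  }
  const rowsPerChunk = Math.floor(maxParams / paramsPerRow);
  const chunks: T[][] = [];
  for (let i = 0; i < rows.length; i += rowsPerChunk) {
    chunks.push(rows.slice(i, i + rowsPerChunk));
  }
  return chunks;
}
```

With 8 bound columns per row and the 999 limit, each statement would carry at most 124 rows.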

Comment thread src/lib/services/database.ts
Comment thread src/lib/components/settings/tabs/api-connection.svelte Outdated
Comment thread src/lib/services/ai/sdk/middleware/retryMiddleware.ts Outdated
Pento95 (Collaborator, Author) commented Apr 20, 2026

Still testing 429 handling with a few providers. Help with tests is welcome.

Comment thread src-tauri/migrations/034_model_health_cache.sql Outdated
Comment thread src/lib/services/ai/sdk/middleware/retryMiddleware.ts Outdated
Comment thread src/lib/services/ai/sdk/middleware/retryMiddleware.ts
Comment thread src/lib/services/ai/sdk/middleware/retryMiddleware.ts
Comment thread src/lib/services/database.ts Outdated
Pento95 and others added 2 commits April 25, 2026 19:42
Co-authored-by: Tony Frazier <16245649+a-frazier@users.noreply.github.com>
Pento95 merged commit 017d990 into AventurasTeam:master Apr 25, 2026
1 check passed