fix: harden completion model migration tenant safety and error handling #240

Open
CCimen wants to merge 1 commit into develop from fix/harden-completion-model-migration

@CCimen CCimen commented Feb 20, 2026

Changes

  • Fix tenant filter in migration service: entity type names were passed in singular form
    ("app", "assistant") but ENTITY_TABLE_MAP uses plural keys ("apps", "assistants"), so the
    lookup never matched and the filter fell through to an unscoped return True. As a result,
    no tenant filtering was applied on any migration
  • Fix transaction conflict in migrate-all-tenants endpoint: removed session.begin() that conflicted
    with the already-begun request-scoped session, causing "A transaction is already begun" for all tenants
  • Add per-tenant model resolution for migrate-all-tenants: resolves source/target models by
    (name, family) per tenant instead of passing the same UUID to all tenants (which fails because
    models are tenant-scoped with different UUIDs)
  • Fix exception mapping in both sysadmin migration endpoints: HTTPException and ValidationException
    are no longer swallowed by the catch-all handler and re-raised as 500
  • Allow disabled/deprecated source model in migration — only the target model must be enabled
  • Make kwargs reset warning informational instead of a blocking compatibility issue that forced
    confirm_migration=true on every migration
  • Reset completion_model_kwargs to {} for all entity types that store them (apps, services,
    assistant_templates, app_templates), not just assistants
  • Scope template migration by tenant_id instead of leaving it unscoped (return True)
  • Add deterministic ordering (is_enabled DESC, updated_at DESC) to per-tenant model resolution
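
For reviewers, here is a minimal sketch of the failure mode behind the first bullet. It is not the actual service code: the contents of ENTITY_TABLE_MAP, the function names, and the row shape are assumptions made for illustration. It shows how a singular key misses the plural map and falls through to an unscoped True, next to a stricter variant that raises on unknown keys.

```python
# Hypothetical stand-in for the real map; only the plural-key convention
# comes from the PR description.
ENTITY_TABLE_MAP = {
    "apps": "apps",
    "assistants": "assistants",
    "services": "services",
}


def tenant_filter_buggy(entity_type: str, row: dict, tenant_id: str) -> bool:
    """Pre-fix shape: an unknown key falls through to an unscoped True."""
    if entity_type in ENTITY_TABLE_MAP:
        return row["tenant_id"] == tenant_id
    return True  # unscoped: every row passes, regardless of tenant


def tenant_filter_strict(entity_type: str, row: dict, tenant_id: str) -> bool:
    """Fail closed: an unknown entity type is an error, never a match-all."""
    if entity_type not in ENTITY_TABLE_MAP:
        raise KeyError(f"unknown entity type: {entity_type!r}")
    return row["tenant_id"] == tenant_id


other_tenant_row = {"id": 1, "tenant_id": "tenant-b"}

# Singular key: the buggy filter lets another tenant's row through.
leaked = tenant_filter_buggy("app", other_tenant_row, "tenant-a")

# Plural key: the row is correctly excluded.
scoped = tenant_filter_strict("apps", other_tenant_row, "tenant-a")
```

Raising on an unknown key is one possible hardening on top of the naming fix; it would have turned the singular/plural mismatch into a loud error instead of a silent cross-tenant update.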

Why

The model migration endpoints had several bugs discovered during a production migration
from Claude Sonnet 3.7 to Claude Haiku 4.5:

  1. The migrate-all-tenants endpoint failed for all tenants with "A transaction is already begun on this Session"
  2. The tenant filter in the migration service never actually filtered by tenant due to singular/plural mismatch
  3. The per-tenant endpoint wrapped all exceptions (including 400/404) as 500 errors
  4. Migrating away from a disabled model was blocked, even though that's the whole point of migration
  5. Every migration required confirm_migration=true because an informational kwargs warning was treated as a blocking issue
  6. Only assistants had their completion_model_kwargs reset — apps, services, and templates kept stale params
  7. The all-tenants endpoint passed the same model UUID to every tenant, but models are tenant-scoped
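
To illustrate point 7 together with the ordering change, here is a rough in-memory sketch of per-tenant resolution by (name, family). The CompletionModel dataclass, the resolve_model_for_tenant helper, and the sample names are hypothetical; the real code queries the database, and only the matching key and the ordering (is_enabled DESC, updated_at DESC) come from this PR.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class CompletionModel:
    # Hypothetical row shape; field names follow the PR description.
    id: str
    tenant_id: str
    name: str
    family: str
    is_enabled: bool
    updated_at: datetime


def resolve_model_for_tenant(
    models: Iterable[CompletionModel], tenant_id: str, name: str, family: str
) -> Optional[CompletionModel]:
    """Pick the tenant's own row for (name, family), or None if it has none."""
    candidates = [
        m for m in models
        if m.tenant_id == tenant_id and m.name == name and m.family == family
    ]
    # Equivalent of ORDER BY is_enabled DESC, updated_at DESC.
    candidates.sort(key=lambda m: (m.is_enabled, m.updated_at), reverse=True)
    return candidates[0] if candidates else None


models = [
    # tenant-a has a stale disabled duplicate that was touched more recently
    CompletionModel("a-disabled", "tenant-a", "target-model", "example-family",
                    False, datetime(2026, 2, 1)),
    CompletionModel("a-enabled", "tenant-a", "target-model", "example-family",
                    True, datetime(2026, 1, 1)),
    CompletionModel("b-1", "tenant-b", "target-model", "example-family",
                    True, datetime(2026, 1, 15)),
]

picked_a = resolve_model_for_tenant(models, "tenant-a", "target-model", "example-family")
picked_b = resolve_model_for_tenant(models, "tenant-b", "target-model", "example-family")
picked_c = resolve_model_for_tenant(models, "tenant-c", "target-model", "example-family")
```

Each tenant resolves to its own ID, the enabled row wins over a more recently updated disabled one, and a tenant without a matching model yields None so the caller can skip or report it.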

…andling

- Fix critical tenant filter bug: singular entity names ("app", "assistant")
  never matched plural keys from ENTITY_TABLE_MAP, causing unscoped updates
- Fix transaction conflict in migrate-all-tenants: remove nested begin() that
  conflicted with already-begun session
- Add per-tenant model resolution for all-tenants migration using (name, family)
  canonical matching instead of passing same UUID to all tenants
- Fix exception mapping: preserve HTTPException status codes instead of wrapping
  all errors as 500
- Allow disabled/deprecated source model in migration (only target must be enabled)
- Make kwargs reset warning informational instead of always forcing confirm_migration
- Reset completion_model_kwargs for all entity types (apps, services, templates),
  not just assistants
- Scope template migration to tenant (templates have tenant_id column)
- Add deterministic ordering to per-tenant model resolution

CCimen added the bug ("Something isn't working") label on Feb 20, 2026

CCimen commented Feb 20, 2026

Follow-up: Test coverage needed

There are currently no tests targeting the migration endpoints, particularly:

  • POST /api/v1/sysadmin/tenants/{tenant_id}/completion-models/{model_id}/migrate
  • POST /api/v1/sysadmin/system/completion-models/{model_id}/migrate-all-tenants

We should add unit/integration tests to confirm the fixes in this PR:

  1. Tenant filter scoping — verify migrations only affect the targeted tenant's entities
  2. Per-tenant model resolution — verify (name, family) lookup resolves correct IDs per tenant, and handles missing mappings gracefully
  3. Transaction safety — verify all-tenants endpoint no longer throws "A transaction is already begun"
  4. Exception mapping — verify ValidationException returns 400, HTTPException preserves its status code
  5. Source model validation — verify disabled/deprecated source models are accepted
  6. confirm_migration behavior — verify compatible migrations succeed without confirm_migration=true
  7. kwargs reset — verify completion_model_kwargs is reset to {} for all entity types (assistants, apps, services, templates)
  8. Template tenant scoping — verify only tenant-owned templates are migrated
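
As a possible starting point for item 4, here is a framework-free sketch of the mapping rule and three tests for it. Both exception classes are stand-ins defined locally so the snippet runs on its own, and map_migration_error is a hypothetical helper name; real tests would import the project's exceptions and call the endpoints through a test client.

```python
class HTTPException(Exception):
    """Stand-in for the framework exception; assumed to carry a status_code."""

    def __init__(self, status_code: int, detail: str = ""):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


class ValidationException(Exception):
    """Stand-in for the project's ValidationException."""


def map_migration_error(exc: Exception) -> HTTPException:
    """Hypothetical helper: decide what the endpoint should raise for exc."""
    if isinstance(exc, HTTPException):
        return exc  # preserve the original status code (e.g. 404)
    if isinstance(exc, ValidationException):
        return HTTPException(400, str(exc))
    # Only genuinely unexpected errors become a 500.
    return HTTPException(500, "Migration failed")


def test_http_exception_keeps_status():
    mapped = map_migration_error(HTTPException(404, "model not found"))
    assert mapped.status_code == 404


def test_validation_exception_maps_to_400():
    mapped = map_migration_error(ValidationException("target model is disabled"))
    assert mapped.status_code == 400
    assert mapped.detail == "target model is disabled"


def test_unexpected_error_maps_to_500():
    mapped = map_migration_error(RuntimeError("connection lost"))
    assert mapped.status_code == 500
```

The order of the isinstance checks matters: the specific exceptions must be handled before the generic fallback, which is the behavior the endpoint fix restores.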

CCimen commented Feb 20, 2026

Fixes #241
